Re: [cephadm] mgr: no daemons active

hmm, okay. It seems like cephadm is stuck in general rather than this being an
issue specific to the upgrade. I'd first make sure the orchestrator isn't
paused; just running "ceph orch resume" should be enough (it's idempotent, so
it's safe to run even if nothing is paused).
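
Something like the following should confirm and clear that ("ceph orch status"
isn't mentioned above, but it should report the backend and whether the module
is paused):

ceph orch status
ceph orch resume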

Beyond that, someone else had an issue with things getting stuck that was
resolved in this thread, which might be worth a look:
https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/NKKLV5TMHFA3ERGCMJ3M7BVLA5PGIR4M/#NKKLV5TMHFA3ERGCMJ3M7BVLA5PGIR4M

If you haven't already, it may also be worth stopping the upgrade, since it
could be interfering with cephadm getting to the point where it does the
redeploy.
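
To stop it and confirm ("ceph orch upgrade status" should then show
"in_progress": false):

ceph orch upgrade stop
ceph orch upgrade status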

If none of that helps, it might be worth setting the log level to debug and
seeing where things end up: run "ceph config set mgr
mgr/cephadm/log_to_cluster_level debug" followed by "ceph orch ps --refresh",
wait a few minutes, then run "ceph log last 100 debug cephadm" (I'm not 100%
sure of the format of that command; if it fails, try just "ceph log last
cephadm"). Those debug logs might tell us more about why it isn't performing
the redeploy. Just remember to set the log level back afterwards with "ceph
config set mgr mgr/cephadm/log_to_cluster_level info", as the debug logs are
quite verbose.
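
Putting that together, the rough sequence would be:

ceph config set mgr mgr/cephadm/log_to_cluster_level debug
ceph orch ps --refresh
# wait a few minutes, then:
ceph log last 100 debug cephadm   # if the format is off, try "ceph log last cephadm"
# once done, turn the verbosity back down:
ceph config set mgr mgr/cephadm/log_to_cluster_level info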

On Fri, Sep 2, 2022 at 11:39 AM Satish Patel <satish.txt@xxxxxxxxx> wrote:

> Hi Adam,
>
> As you said, I did the following:
>
> $ ceph orch daemon redeploy mgr.ceph1.smfvfd  quay.io/ceph/ceph:v16.2.10
>
> I noticed the following line in the logs, but then no activity at all; the
> standby mgr is still running the older version:
>
> 2022-09-02T15:35:45.753093+0000 mgr.ceph2.huidoh (mgr.344392) 2226 :
> cephadm [INF] Schedule redeploy daemon mgr.ceph1.smfvfd
> 2022-09-02T15:36:17.279190+0000 mgr.ceph2.huidoh (mgr.344392) 2245 :
> cephadm [INF] refreshing ceph2 facts
> 2022-09-02T15:36:17.984478+0000 mgr.ceph2.huidoh (mgr.344392) 2246 :
> cephadm [INF] refreshing ceph1 facts
> 2022-09-02T15:37:17.663730+0000 mgr.ceph2.huidoh (mgr.344392) 2284 :
> cephadm [INF] refreshing ceph2 facts
> 2022-09-02T15:37:18.386586+0000 mgr.ceph2.huidoh (mgr.344392) 2285 :
> cephadm [INF] refreshing ceph1 facts
>
> I am also not seeing any image being downloaded:
>
> root@ceph1:~# docker image ls
> REPOSITORY                         TAG       IMAGE ID       CREATED         SIZE
> quay.io/ceph/ceph                  v15       93146564743f   3 weeks ago     1.2GB
> quay.io/ceph/ceph-grafana          8.3.5     dad864ee21e9   4 months ago    558MB
> quay.io/prometheus/prometheus      v2.33.4   514e6a882f6e   6 months ago    204MB
> quay.io/prometheus/alertmanager    v0.23.0   ba2b418f427c   12 months ago   57.5MB
> quay.io/ceph/ceph-grafana          6.7.4     557c83e11646   13 months ago   486MB
> quay.io/prometheus/prometheus      v2.18.1   de242295e225   2 years ago     140MB
> quay.io/prometheus/alertmanager    v0.20.0   0881eb8f169f   2 years ago     52.1MB
> quay.io/prometheus/node-exporter   v0.18.1   e5a616e4b9cf   3 years ago     22.9MB
>
>
> On Fri, Sep 2, 2022 at 11:06 AM Adam King <adking@xxxxxxxxxx> wrote:
>
>> hmm, at this point, maybe we should just try manually upgrading the mgr
>> daemons and go from there. First, stop the upgrade with "ceph orch upgrade
>> stop". Then figure out which of the two mgr daemons is the standby ("ceph -s"
>> output says which one is active) and do a "ceph orch daemon redeploy
>> <standby-mgr-name> quay.io/ceph/ceph:v16.2.10"; that should redeploy that
>> specific mgr with the new version. You can then do a "ceph mgr fail" to swap
>> which of the mgr daemons is active, and run another "ceph orch daemon redeploy
>> <standby-mgr-name> quay.io/ceph/ceph:v16.2.10", where the standby is now the
>> other mgr still on 15.2.17. Once both mgr daemons are upgraded to the new
>> version, run a "ceph orch redeploy mgr" and then "ceph orch upgrade start
>> --image quay.io/ceph/ceph:v16.2.10" and see if it goes better.
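>>
>> With the daemon names from your cluster (ceph2.huidoh currently active and
>> ceph1.smfvfd the standby, per the "ceph -s" output below), the whole
>> sequence would look roughly like:
>>
>> ceph orch upgrade stop
>> ceph orch daemon redeploy mgr.ceph1.smfvfd quay.io/ceph/ceph:v16.2.10
>> ceph mgr fail
>> # once ceph1.smfvfd has taken over as the active mgr:
>> ceph orch daemon redeploy mgr.ceph2.huidoh quay.io/ceph/ceph:v16.2.10
>> ceph orch redeploy mgr
>> ceph orch upgrade start --image quay.io/ceph/ceph:v16.2.10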
>>
>> On Fri, Sep 2, 2022 at 10:36 AM Satish Patel <satish.txt@xxxxxxxxx>
>> wrote:
>>
>>> Hi Adam,
>>>
>>> I ran the following command to start the upgrade, but it looks like nothing
>>> is happening:
>>>
>>> $ ceph orch upgrade start --image quay.io/ceph/ceph:v16.2.10
>>>
>>> The status message is empty:
>>>
>>> root@ceph1:~# ceph orch upgrade status
>>> {
>>>     "target_image": "quay.io/ceph/ceph:v16.2.10",
>>>     "in_progress": true,
>>>     "services_complete": [],
>>>     "message": ""
>>> }
>>>
>>> Nothing in the logs:
>>>
>>> root@ceph1:~# tail -f
>>> /var/log/ceph/f270ad9e-1f6f-11ed-b6f8-a539d87379ea/ceph.cephadm.log
>>> 2022-09-02T14:31:52.597661+0000 mgr.ceph2.huidoh (mgr.344392) 174 :
>>> cephadm [INF] refreshing ceph2 facts
>>> 2022-09-02T14:31:52.991450+0000 mgr.ceph2.huidoh (mgr.344392) 176 :
>>> cephadm [INF] refreshing ceph1 facts
>>> 2022-09-02T14:32:52.965092+0000 mgr.ceph2.huidoh (mgr.344392) 207 :
>>> cephadm [INF] refreshing ceph2 facts
>>> 2022-09-02T14:32:53.369789+0000 mgr.ceph2.huidoh (mgr.344392) 208 :
>>> cephadm [INF] refreshing ceph1 facts
>>> 2022-09-02T14:33:53.367986+0000 mgr.ceph2.huidoh (mgr.344392) 239 :
>>> cephadm [INF] refreshing ceph2 facts
>>> 2022-09-02T14:33:53.760427+0000 mgr.ceph2.huidoh (mgr.344392) 240 :
>>> cephadm [INF] refreshing ceph1 facts
>>> 2022-09-02T14:34:53.754277+0000 mgr.ceph2.huidoh (mgr.344392) 272 :
>>> cephadm [INF] refreshing ceph2 facts
>>> 2022-09-02T14:34:54.162503+0000 mgr.ceph2.huidoh (mgr.344392) 273 :
>>> cephadm [INF] refreshing ceph1 facts
>>> 2022-09-02T14:35:54.133467+0000 mgr.ceph2.huidoh (mgr.344392) 305 :
>>> cephadm [INF] refreshing ceph2 facts
>>> 2022-09-02T14:35:54.522171+0000 mgr.ceph2.huidoh (mgr.344392) 306 :
>>> cephadm [INF] refreshing ceph1 facts
>>>
>>> That in-progress message has been stuck there for a long time:
>>>
>>> root@ceph1:~# ceph -s
>>>   cluster:
>>>     id:     f270ad9e-1f6f-11ed-b6f8-a539d87379ea
>>>     health: HEALTH_OK
>>>
>>>   services:
>>>     mon: 1 daemons, quorum ceph1 (age 9h)
>>>     mgr: ceph2.huidoh(active, since 9m), standbys: ceph1.smfvfd
>>>     osd: 4 osds: 4 up (since 9h), 4 in (since 11h)
>>>
>>>   data:
>>>     pools:   5 pools, 129 pgs
>>>     objects: 20.06k objects, 83 GiB
>>>     usage:   168 GiB used, 632 GiB / 800 GiB avail
>>>     pgs:     129 active+clean
>>>
>>>   io:
>>>     client:   12 KiB/s wr, 0 op/s rd, 1 op/s wr
>>>
>>>   progress:
>>>     Upgrade to quay.io/ceph/ceph:v16.2.10 (0s)
>>>       [............................]
>>>
>>> On Fri, Sep 2, 2022 at 10:25 AM Satish Patel <satish.txt@xxxxxxxxx>
>>> wrote:
>>>
>>>> It looks like I did it with the following command:
>>>>
>>>> $ ceph orch daemon add mgr ceph2:10.73.0.192
>>>>
>>>> Now I can see two mgr daemons, both still on version 15.x:
>>>>
>>>> root@ceph1:~# ceph orch ps --daemon-type mgr
>>>> NAME              HOST   STATUS         REFRESHED  AGE  VERSION  IMAGE
>>>> NAME
>>>>           IMAGE ID      CONTAINER ID
>>>> mgr.ceph1.smfvfd  ceph1  running (8h)   41s ago    8h   15.2.17
>>>> quay.io/ceph/ceph@sha256:c08064dde4bba4e72a1f55d90ca32df9ef5aafab82efe2e0a0722444a5aaacca
>>>>  93146564743f  1aab837306d2
>>>> mgr.ceph2.huidoh  ceph2  running (60s)  110s ago   60s  15.2.17
>>>> quay.io/ceph/ceph@sha256:c08064dde4bba4e72a1f55d90ca32df9ef5aafab82efe2e0a0722444a5aaacca
>>>>  93146564743f  294fd6ab6c97
>>>>
>>>> On Fri, Sep 2, 2022 at 10:19 AM Satish Patel <satish.txt@xxxxxxxxx>
>>>> wrote:
>>>>
>>>>> Let's come back to the original question: how to bring back the second
>>>>> mgr?
>>>>>
>>>>> root@ceph1:~# ceph orch apply mgr 2
>>>>> Scheduled mgr update...
>>>>>
>>>>> Nothing happened with the above command; the logs show nothing useful:
>>>>>
>>>>> 2022-09-02T14:16:20.407927+0000 mgr.ceph1.smfvfd (mgr.334626) 16939 :
>>>>> cephadm [INF] refreshing ceph2 facts
>>>>> 2022-09-02T14:16:40.247195+0000 mgr.ceph1.smfvfd (mgr.334626) 16952 :
>>>>> cephadm [INF] Saving service mgr spec with placement count:2
>>>>> 2022-09-02T14:16:53.106919+0000 mgr.ceph1.smfvfd (mgr.334626) 16961 :
>>>>> cephadm [INF] Saving service mgr spec with placement count:2
>>>>> 2022-09-02T14:17:19.135203+0000 mgr.ceph1.smfvfd (mgr.334626) 16975 :
>>>>> cephadm [INF] refreshing ceph1 facts
>>>>> 2022-09-02T14:17:20.780496+0000 mgr.ceph1.smfvfd (mgr.334626) 16977 :
>>>>> cephadm [INF] refreshing ceph2 facts
>>>>> 2022-09-02T14:18:19.502034+0000 mgr.ceph1.smfvfd (mgr.334626) 17008 :
>>>>> cephadm [INF] refreshing ceph1 facts
>>>>> 2022-09-02T14:18:21.127973+0000 mgr.ceph1.smfvfd (mgr.334626) 17010 :
>>>>> cephadm [INF] refreshing ceph2 facts
>>>>>
>>>>> On Fri, Sep 2, 2022 at 10:15 AM Satish Patel <satish.txt@xxxxxxxxx>
>>>>> wrote:
>>>>>
>>>>>> Hi Adam,
>>>>>>
>>>>>> Wait.. wait.. now it's suddenly working without me doing anything. Very
>>>>>> odd:
>>>>>>
>>>>>> root@ceph1:~# ceph orch ls
>>>>>> NAME                  RUNNING  REFRESHED  AGE  PLACEMENT    IMAGE
>>>>>> NAME
>>>>>>           IMAGE ID
>>>>>> alertmanager              1/1  5s ago     2w   count:1
>>>>>> quay.io/prometheus/alertmanager:v0.20.0
>>>>>>                        0881eb8f169f
>>>>>> crash                     2/2  5s ago     2w   *
>>>>>> quay.io/ceph/ceph:v15
>>>>>>                        93146564743f
>>>>>> grafana                   1/1  5s ago     2w   count:1
>>>>>> quay.io/ceph/ceph-grafana:6.7.4
>>>>>>                        557c83e11646
>>>>>> mgr                       1/2  5s ago     8h   count:2
>>>>>> quay.io/ceph/ceph@sha256:c08064dde4bba4e72a1f55d90ca32df9ef5aafab82efe2e0a0722444a5aaacca
>>>>>>  93146564743f
>>>>>> mon                       1/2  5s ago     8h   ceph1;ceph2
>>>>>> quay.io/ceph/ceph:v15
>>>>>>                        93146564743f
>>>>>> node-exporter             2/2  5s ago     2w   *
>>>>>> quay.io/prometheus/node-exporter:v0.18.1
>>>>>>                       e5a616e4b9cf
>>>>>> osd.osd_spec_default      4/0  5s ago     -    <unmanaged>
>>>>>> quay.io/ceph/ceph:v15
>>>>>>                        93146564743f
>>>>>> prometheus                1/1  5s ago     2w   count:1
>>>>>> quay.io/prometheus/prometheus:v2.18.1
>>>>>>
>>>>>> On Fri, Sep 2, 2022 at 10:13 AM Satish Patel <satish.txt@xxxxxxxxx>
>>>>>> wrote:
>>>>>>
>>>>>>> I can see that in the output but I'm not sure how to get rid of it.
>>>>>>>
>>>>>>> root@ceph1:~# ceph orch ps --refresh
>>>>>>> NAME
>>>>>>>      HOST   STATUS        REFRESHED  AGE  VERSION    IMAGE NAME
>>>>>>>                                                                     IMAGE
>>>>>>> ID      CONTAINER ID
>>>>>>> alertmanager.ceph1
>>>>>>>      ceph1  running (9h)  64s ago    2w   0.20.0
>>>>>>> quay.io/prometheus/alertmanager:v0.20.0
>>>>>>>                        0881eb8f169f  ba804b555378
>>>>>>> cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d
>>>>>>>  ceph2  stopped       65s ago    -    <unknown>  <unknown>
>>>>>>>                                                                  <unknown>
>>>>>>>     <unknown>
>>>>>>> crash.ceph1
>>>>>>>       ceph1  running (9h)  64s ago    2w   15.2.17
>>>>>>> quay.io/ceph/ceph:v15
>>>>>>>                        93146564743f  a3a431d834fc
>>>>>>> crash.ceph2
>>>>>>>       ceph2  running (9h)  65s ago    13d  15.2.17
>>>>>>> quay.io/ceph/ceph:v15
>>>>>>>                        93146564743f  3c963693ff2b
>>>>>>> grafana.ceph1
>>>>>>>       ceph1  running (9h)  64s ago    2w   6.7.4
>>>>>>> quay.io/ceph/ceph-grafana:6.7.4
>>>>>>>                        557c83e11646  7583a8dc4c61
>>>>>>> mgr.ceph1.smfvfd
>>>>>>>      ceph1  running (8h)  64s ago    8h   15.2.17
>>>>>>> quay.io/ceph/ceph@sha256:c08064dde4bba4e72a1f55d90ca32df9ef5aafab82efe2e0a0722444a5aaacca
>>>>>>>  93146564743f  1aab837306d2
>>>>>>> mon.ceph1
>>>>>>>       ceph1  running (9h)  64s ago    2w   15.2.17
>>>>>>> quay.io/ceph/ceph:v15
>>>>>>>                        93146564743f  c1d155d8c7ad
>>>>>>> node-exporter.ceph1
>>>>>>>       ceph1  running (9h)  64s ago    2w   0.18.1
>>>>>>> quay.io/prometheus/node-exporter:v0.18.1
>>>>>>>                         e5a616e4b9cf  2ff235fe0e42
>>>>>>> node-exporter.ceph2
>>>>>>>       ceph2  running (9h)  65s ago    13d  0.18.1
>>>>>>> quay.io/prometheus/node-exporter:v0.18.1
>>>>>>>                         e5a616e4b9cf  17678b9ba602
>>>>>>> osd.0
>>>>>>>       ceph1  running (9h)  64s ago    13d  15.2.17
>>>>>>> quay.io/ceph/ceph:v15
>>>>>>>                        93146564743f  d0fd73b777a3
>>>>>>> osd.1
>>>>>>>       ceph1  running (9h)  64s ago    13d  15.2.17
>>>>>>> quay.io/ceph/ceph:v15
>>>>>>>                        93146564743f  049120e83102
>>>>>>> osd.2
>>>>>>>       ceph2  running (9h)  65s ago    13d  15.2.17
>>>>>>> quay.io/ceph/ceph:v15
>>>>>>>                        93146564743f  8700e8cefd1f
>>>>>>> osd.3
>>>>>>>       ceph2  running (9h)  65s ago    13d  15.2.17
>>>>>>> quay.io/ceph/ceph:v15
>>>>>>>                        93146564743f  9c71bc87ed16
>>>>>>> prometheus.ceph1
>>>>>>>      ceph1  running (9h)  64s ago    2w   2.18.1
>>>>>>> quay.io/prometheus/prometheus:v2.18.1
>>>>>>>                        de242295e225  74a538efd61e
>>>>>>>
>>>>>>> On Fri, Sep 2, 2022 at 10:10 AM Adam King <adking@xxxxxxxxxx> wrote:
>>>>>>>
>>>>>>>> maybe also a "ceph orch ps --refresh"? It might still have the old cached
>>>>>>>> daemon inventory from before you removed the files.
>>>>>>>>
>>>>>>>> On Fri, Sep 2, 2022 at 9:57 AM Satish Patel <satish.txt@xxxxxxxxx>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Adam,
>>>>>>>>>
>>>>>>>>> I have deleted the file located here: rm
>>>>>>>>> /var/lib/ceph/f270ad9e-1f6f-11ed-b6f8-a539d87379ea/cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d
>>>>>>>>>
>>>>>>>>> But I am still getting the same error. Do I need to do anything else?
>>>>>>>>>
>>>>>>>>> On Fri, Sep 2, 2022 at 9:51 AM Adam King <adking@xxxxxxxxxx>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Okay, I'm wondering if this is a version-mismatch issue: you previously
>>>>>>>>>> had a 16.2.10 mgr and now have a 15.2.17 one that doesn't expect this sort
>>>>>>>>>> of thing to be present. Either way, I'd think just deleting this
>>>>>>>>>> cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d
>>>>>>>>>> file (and any others like it) would be the way forward to get "ceph orch
>>>>>>>>>> ls" working again.
>>>>>>>>>>
>>>>>>>>>> On Fri, Sep 2, 2022 at 9:44 AM Satish Patel <satish.txt@xxxxxxxxx>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Adam,
>>>>>>>>>>>
>>>>>>>>>>> In "cephadm ls" I found the following entry, but I believe it was there
>>>>>>>>>>> before as well.
>>>>>>>>>>>
>>>>>>>>>>> {
>>>>>>>>>>>         "style": "cephadm:v1",
>>>>>>>>>>>         "name":
>>>>>>>>>>> "cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d",
>>>>>>>>>>>         "fsid": "f270ad9e-1f6f-11ed-b6f8-a539d87379ea",
>>>>>>>>>>>         "systemd_unit":
>>>>>>>>>>> "ceph-f270ad9e-1f6f-11ed-b6f8-a539d87379ea@cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d
>>>>>>>>>>> ",
>>>>>>>>>>>         "enabled": false,
>>>>>>>>>>>         "state": "stopped",
>>>>>>>>>>>         "container_id": null,
>>>>>>>>>>>         "container_image_name": null,
>>>>>>>>>>>         "container_image_id": null,
>>>>>>>>>>>         "version": null,
>>>>>>>>>>>         "started": null,
>>>>>>>>>>>         "created": null,
>>>>>>>>>>>         "deployed": null,
>>>>>>>>>>>         "configured": null
>>>>>>>>>>>     },
>>>>>>>>>>>
>>>>>>>>>>> Looks like the remove didn't work:
>>>>>>>>>>>
>>>>>>>>>>> root@ceph1:~# ceph orch rm cephadm
>>>>>>>>>>> Failed to remove service. <cephadm> was not found.
>>>>>>>>>>>
>>>>>>>>>>> root@ceph1:~# ceph orch rm
>>>>>>>>>>> cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d
>>>>>>>>>>> Failed to remove service.
>>>>>>>>>>> <cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d>
>>>>>>>>>>> was not found.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Sep 2, 2022 at 8:27 AM Adam King <adking@xxxxxxxxxx>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> this looks like an old traceback you would get if you somehow ended up
>>>>>>>>>>>> with a service type that shouldn't be there. The first thing I'd check
>>>>>>>>>>>> is that "cephadm ls" on either host definitely doesn't report any strange
>>>>>>>>>>>> entries that aren't actually daemons in your cluster, such as
>>>>>>>>>>>> "cephadm.<hash>". Another thing you could try, since the assertion it's
>>>>>>>>>>>> giving is for an unknown service type ("AssertionError: cephadm"), is
>>>>>>>>>>>> "ceph orch rm cephadm", which might cause it to remove whatever it thinks
>>>>>>>>>>>> this "cephadm" service is that it has deployed. Lastly, you could have
>>>>>>>>>>>> the mgr you manually deploy be a 16.2.10 one instead of 15.2.17 (I'm
>>>>>>>>>>>> assuming versions here, but the line numbers in that traceback suggest
>>>>>>>>>>>> octopus). The 16.2.10 one is much less likely to have a bug that causes
>>>>>>>>>>>> something like this.
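>>>>>>>>>>>>
>>>>>>>>>>>> Concretely, the first two checks would be something like this (the grep
>>>>>>>>>>>> just pulls the daemon names out of the JSON that "cephadm ls" prints):
>>>>>>>>>>>>
>>>>>>>>>>>> cephadm ls | grep '"name"'   # any "cephadm.<hash>" entries aren't real daemons
>>>>>>>>>>>> ceph orch rm cephadm         # may or may not be accepted, but worth a try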
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Sep 2, 2022 at 1:41 AM Satish Patel <
>>>>>>>>>>>> satish.txt@xxxxxxxxx> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Now when I run "ceph orch ps" it works, but the following command
>>>>>>>>>>>>> throws an error. I tried to bring up the second mgr using "ceph orch
>>>>>>>>>>>>> apply mgr", but that didn't help.
>>>>>>>>>>>>>
>>>>>>>>>>>>> root@ceph1:/ceph-disk# ceph version
>>>>>>>>>>>>> ceph version 15.2.17
>>>>>>>>>>>>> (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus
>>>>>>>>>>>>> (stable)
>>>>>>>>>>>>>
>>>>>>>>>>>>> root@ceph1:/ceph-disk# ceph orch ls
>>>>>>>>>>>>> Error EINVAL: Traceback (most recent call last):
>>>>>>>>>>>>>   File "/usr/share/ceph/mgr/mgr_module.py", line 1212, in
>>>>>>>>>>>>> _handle_command
>>>>>>>>>>>>>     return self.handle_command(inbuf, cmd)
>>>>>>>>>>>>>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line
>>>>>>>>>>>>> 140, in
>>>>>>>>>>>>> handle_command
>>>>>>>>>>>>>     return dispatch[cmd['prefix']].call(self, cmd, inbuf)
>>>>>>>>>>>>>   File "/usr/share/ceph/mgr/mgr_module.py", line 320, in call
>>>>>>>>>>>>>     return self.func(mgr, **kwargs)
>>>>>>>>>>>>>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line
>>>>>>>>>>>>> 102, in
>>>>>>>>>>>>> <lambda>
>>>>>>>>>>>>>     wrapper_copy = lambda *l_args, **l_kwargs:
>>>>>>>>>>>>> wrapper(*l_args, **l_kwargs)
>>>>>>>>>>>>>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line
>>>>>>>>>>>>> 91, in wrapper
>>>>>>>>>>>>>     return func(*args, **kwargs)
>>>>>>>>>>>>>   File "/usr/share/ceph/mgr/orchestrator/module.py", line 503,
>>>>>>>>>>>>> in
>>>>>>>>>>>>> _list_services
>>>>>>>>>>>>>     raise_if_exception(completion)
>>>>>>>>>>>>>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line
>>>>>>>>>>>>> 642, in
>>>>>>>>>>>>> raise_if_exception
>>>>>>>>>>>>>     raise e
>>>>>>>>>>>>> AssertionError: cephadm
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Sep 2, 2022 at 1:32 AM Satish Patel <
>>>>>>>>>>>>> satish.txt@xxxxxxxxx> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> > never mind, I found the doc related to that and was able to get one
>>>>>>>>>>>>> > mgr up:
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > https://docs.ceph.com/en/quincy/cephadm/troubleshooting/#manually-deploying-a-mgr-daemon
>>>>>>>>>>>>> >
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > On Fri, Sep 2, 2022 at 1:21 AM Satish Patel <
>>>>>>>>>>>>> satish.txt@xxxxxxxxx> wrote:
>>>>>>>>>>>>> >
>>>>>>>>>>>>> >> Folks,
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> I am having quite a time with cephadm and it's very annoying to
>>>>>>>>>>>>> >> deal with.
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> I have deployed a ceph cluster using cephadm on two nodes. When I
>>>>>>>>>>>>> >> tried to upgrade, I hit a hiccup where it upgraded a single mgr to
>>>>>>>>>>>>> >> 16.2.10 but not the other, so I started messing around and somehow
>>>>>>>>>>>>> >> deleted both mgr daemons, thinking cephadm would recreate them.
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> Now I don't have a single mgr, so my ceph orch commands hang
>>>>>>>>>>>>> >> forever; it looks like a chicken-and-egg issue.
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> How do I recover from this? If I can't run ceph orch commands, I
>>>>>>>>>>>>> >> won't be able to redeploy my mgr daemons.
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> I am not able to find any mgr with the following command on either
>>>>>>>>>>>>> >> node:
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> $ cephadm ls | grep mgr
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >
>>>>>>>>>>>>>
>>>>>>>>>>>>>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


