Thank you. I will try the export and import method first.

Thank you,
Anantha

-----Original Message-----
From: Eugen Block <eblock@xxxxxx>
Sent: Monday, April 1, 2024 1:57 PM
To: Adiga, Anantha <anantha.adiga@xxxxxxxxx>
Cc: ceph-users@xxxxxxx
Subject: Re: Re: ceph status not showing correct monitor services

I have two approaches in mind. The first (and preferred) one would be to edit the mon spec to remove mon.a001s016 and get back to a clean state. Get the current spec with:

ceph orch ls mon --export > mon-edit.yaml

Edit the spec file so that mon.a001s016 is not part of it, then apply:

ceph orch apply -i mon-edit.yaml

This should remove the mon.a001s016 daemon. Then wait a few minutes or so (until the daemon is actually gone; check locally on the node with 'cephadm ls' and in /var/lib/ceph/<FSID>/removed), add it back to the spec file, and apply again. I would expect a third MON to be deployed. If that doesn't work for some reason, you'll need to inspect the logs to find the root cause.

The second approach would be to remove and add the daemon manually:

ceph orch daemon rm mon.a001s016

Wait until it's really gone, then add it:

ceph orch daemon add mon a001s016

I'm not entirely sure about the 'daemon add mon' command; you might need to provide something else, as I'm typing this from memory.

Quoting "Adiga, Anantha" <anantha.adiga@xxxxxxxxx>:

> Hi Eugen,
>
> Yes, that is it. OSDs were restarted since mon a001s017 was reporting
> that it is low on available space. How do I update the mon map to add
> mon.a001s016, as it is already online?
> And how do I update the mgr map to include standby mgr.a001s018, as it is
> also running?
>
> ceph mon dump
> dumped monmap epoch 6
> epoch 6
> fsid 604d56db-2fab-45db-a9ea-c418f9a8cca8
> last_changed 2024-03-31T23:54:18.692983+0000
> created 2021-09-30T16:15:12.884602+0000
> min_mon_release 16 (pacific)
> election_strategy: 1
> 0: [v2:10.45.128.28:3300/0,v1:10.45.128.28:6789/0] mon.a001s018
> 1: [v2:10.45.128.27:3300/0,v1:10.45.128.27:6789/0] mon.a001s017
>
> Thank you,
> Anantha
>
> -----Original Message-----
> From: Eugen Block <eblock@xxxxxx>
> Sent: Monday, April 1, 2024 1:10 PM
> To: ceph-users@xxxxxxx
> Subject: Re: ceph status not showing correct monitor services
>
> Maybe it's just not in the monmap? Can you show the output of:
>
> ceph mon dump
>
> Did you do any maintenance (apparently OSDs restarted recently) and
> maybe accidentally remove a MON from the monmap?
>
> Quoting "Adiga, Anantha" <anantha.adiga@xxxxxxxxx>:
>
>> Hi Anthony,
>>
>> Seeing it since last afternoon. It is the same with the mgr services:
>> "ceph -s" is reporting only TWO instead of THREE.
>>
>> Also, mon and mgr show "is_active: false"; see below.
>>
>> # ceph orch ps --daemon_type=mgr
>> NAME                 HOST      PORTS   STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
>> mgr.a001s016.ctmoay  a001s016  *:8443  running (18M)  3m ago     23M  206M     -        16.2.5   6e73176320aa  169cafcbbb99
>> mgr.a001s017.bpygfm  a001s017  *:8443  running (19M)  3m ago     23M  332M     -        16.2.5   6e73176320aa  97257195158c
>> mgr.a001s018.hcxnef  a001s018  *:8443  running (20M)  3m ago     23M  113M     -        16.2.5   6e73176320aa  21ba5896cee2
>>
>> # ceph orch ls --service_name=mgr
>> NAME  PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
>> mgr          3/3      3m ago     23M  a001s016;a001s017;a001s018;count:3
>>
>> # ceph orch ps --daemon_type=mon --format=json-pretty
>>
>> [
>>   {
>>     "container_id": "8484a912f96a",
>>     "container_image_digests": [
>>       "docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586"
>>     ],
>>     "container_image_id": "6e73176320aaccf3b3fb660b9945d0514222bd7a83e28b96e8440c630ba6891f",
>>     "container_image_name": "docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586",
>>     "created": "2024-03-31T23:55:16.164155Z",
>>     "daemon_id": "a001s016",
>>     "daemon_type": "mon",
>>     "hostname": "a001s016",
>>     "is_active": false,        <== why is it false
>>     "last_refresh": "2024-04-01T19:38:30.929014Z",
>>     "memory_request": 2147483648,
>>     "memory_usage": 761685606,
>>     "ports": [],
>>     "service_name": "mon",
>>     "started": "2024-03-31T23:55:16.268266Z",
>>     "status": 1,
>>     "status_desc": "running",
>>     "version": "16.2.5"
>>   },
>>
>> Thank you,
>> Anantha
>>
>> From: Anthony D'Atri <aad@xxxxxxxxxxxxxx>
>> Sent: Monday, April 1, 2024 12:25 PM
>> To: Adiga, Anantha <anantha.adiga@xxxxxxxxx>
>> Cc: ceph-users@xxxxxxx
>> Subject: Re: ceph status not showing correct monitor services
>>
>> a001s017.bpygfm(active, since 13M), standbys: a001s016.ctmoay
>>
>> Looks like you just had an mgr failover? Could be that the secondary
>> mgr hasn't caught up with current events.
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx
>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
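
The mismatch diagnosed in this thread (a mon that 'ceph orch ps' reports as running but that does not appear in 'ceph mon dump') can be checked mechanically. What follows is only a minimal sketch, not part of the original exchange. It assumes the plain-text layout of 'ceph mon dump' and the JSON field names ("daemon_type", "daemon_id", "status_desc") shown above. The helper names are hypothetical. The a001s017 and a001s018 JSON entries are assumed for illustration, since the thread shows only the a001s016 entry.

```python
import json
import re

# Sample text copied from the `ceph mon dump` output quoted in the thread.
MON_DUMP = """\
epoch 6
fsid 604d56db-2fab-45db-a9ea-c418f9a8cca8
min_mon_release 16 (pacific)
election_strategy: 1
0: [v2:10.45.128.28:3300/0,v1:10.45.128.28:6789/0] mon.a001s018
1: [v2:10.45.128.27:3300/0,v1:10.45.128.27:6789/0] mon.a001s017
"""

# Reduced stand-in for `ceph orch ps --daemon_type=mon --format=json-pretty`.
# ASSUMPTION: only the a001s016 entry appears in the thread; the other two
# entries are illustrative and follow the same field names.
ORCH_PS_JSON = """\
[
  {"daemon_type": "mon", "daemon_id": "a001s016", "status_desc": "running"},
  {"daemon_type": "mon", "daemon_id": "a001s017", "status_desc": "running"},
  {"daemon_type": "mon", "daemon_id": "a001s018", "status_desc": "running"}
]
"""

# Matches rank lines such as "0: [v2:...,v1:...] mon.a001s018".
RANK_LINE = re.compile(r"^\d+:\s+\[.*\]\s+mon\.(\S+)\s*$")


def monmap_names(mon_dump_text):
    """Return the set of monitor names listed as ranks in mon dump text."""
    names = set()
    for line in mon_dump_text.splitlines():
        match = RANK_LINE.match(line.strip())
        if match:
            names.add(match.group(1))
    return names


def running_mon_ids(orch_ps_json):
    """Return the daemon_id of every mon that the orchestrator reports as running."""
    daemons = json.loads(orch_ps_json)
    return {
        d["daemon_id"]
        for d in daemons
        if d.get("daemon_type") == "mon" and d.get("status_desc") == "running"
    }


def missing_from_monmap(mon_dump_text, orch_ps_json):
    """Return running mons that are absent from the monmap, sorted by name."""
    return sorted(running_mon_ids(orch_ps_json) - monmap_names(mon_dump_text))


print(missing_from_monmap(MON_DUMP, ORCH_PS_JSON))
```

On a live cluster the two strings would come from the output of the two commands instead of the embedded samples. Any name the function returns is a candidate for the remove-and-redeploy procedure Eugen describes.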