Re: Ceph Pacific mon is not starting after host reboot

Adrian Nicolae <adrian.nicolae@xxxxxxxxxx> · Sun, 23 May 2021 20:18:55 +0300

I think the orchestrator is trying to bring it up, but the mon is not
starting (see the errors in my previous e-mail). The container does not
start even when I try to start it manually.

The placement is the default one. Ceph started the mons automatically
on all my hosts because I only have 3 hosts and the default mon count is 5.

root@node01:/home/adrian# ceph orch ls
NAME                       PORTS   RUNNING  REFRESHED  AGE  PLACEMENT
alertmanager                           1/1  16m ago    5h   count:1
crash                                  3/3  16m ago    5h   *
grafana                                1/1  16m ago    5h   count:1
mgr                                    2/2  16m ago    5h   count:2
mon                                    3/5  16m ago    10h  count:5
node-exporter                          3/3  16m ago    5h   *
osd.all-available-devices            12/15  16m ago    4h   *
prometheus                             1/1  16m ago    5h   count:1
rgw.digi1                  ?:8000      3/3  16m ago    3h   node01;node02;node03;count:3
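
A quick way to spot services that run fewer daemons than their placement asks for, such as the mon at 3/5, is to filter this listing. The sketch below is only an illustration: `underfilled_services` is a hypothetical helper name, and it assumes the plain-text `ceph orch ls` layout shown here, where the RUNNING column has the form `running/expected`.

```shell
# Hypothetical helper: reads `ceph orch ls` text on stdin and prints the
# service name and RUNNING value for every service with running < expected.
# It scans for the first "N/M" field, so an empty PORTS column is tolerated.
underfilled_services() {
    awk '{
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^[0-9]+\/[0-9]+$/) {
                split($i, n, "/")
                if (n[1] + 0 < n[2] + 0) print $1, $i
                break
            }
        }
    }'
}
```

It could be used as `ceph orch ls | underfilled_services`; on the listing above it would report the mon and osd.all-available-devices rows.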

I added the hosts using only their hostnames:

root@node01:/home/adrian# ceph orch host ls
HOST    ADDR          LABELS  STATUS
node01  192.168.80.2
node02  node02
node03  node03

On 5/23/2021 7:52 PM, 胡 玮文 wrote:
So the orchestrator is aware that the mon is stopped, but has not tried to bring it up again. What is the placement of mon shown in “ceph orch ls”? I explicitly set it to all host names (e.g. node01;node02;node03), and haven’t experienced this.
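
For reference, pinning the mons to an explicit host list might look like the sketch below. It assumes the `ceph orch apply mon --placement="host1,host2,host3"` form described in the cephadm documentation; `mon_placement_cmd` is a hypothetical helper that only prints the command so it can be reviewed before being run.

```shell
# Hypothetical helper: joins the given host names with commas and prints
# (does not execute) a `ceph orch apply mon` command for those hosts.
# The function body runs in a subshell so the IFS change stays local.
mon_placement_cmd() (
    IFS=,
    printf 'ceph orch apply mon --placement="%s"\n' "$*"
)

mon_placement_cmd node01 node02 node03
# prints: ceph orch apply mon --placement="node01,node02,node03"
```

Whether an explicit placement avoids the restart problem reported in this thread is not confirmed here; it only replaces the default count:5 placement.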

On May 24, 2021, at 00:35, Adrian Nicolae <adrian.nicolae@xxxxxxxxxx> wrote:

Hi,

I waited for more than a day after the first mon failure, and it did not resolve automatically.

I checked with 'ceph status' and also the ceph.conf on that host, and the failed mon had been removed from the monmap. The cluster reported only 2 mons (instead of 3), and the third mon was completely removed from the config; it wasn't reported as a failure in 'ceph status'.

On 5/23/2021 7:30 PM, 胡 玮文 wrote:
Hi Adrian,

I have not tried this, but I think it will resolve itself automatically after some minutes. How long did you wait before doing the manual redeploy?

Could you also try “ceph mon dump” to see whether mon.node03 is actually removed from the monmap when it fails to start?
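
The monmap check suggested here can be sketched as a small filter. It assumes the plain-text `ceph mon dump` output lists each monitor on its own line ending in `mon.<name>`; `mon_in_monmap` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `ceph mon dump` text on stdin and succeeds
# (exit status 0) only if a line ends with the field "mon.<name>".
mon_in_monmap() {
    awk -v m="mon.$1" '$NF == m { found = 1 } END { exit !found }'
}
```

A possible use is `ceph mon dump | mon_in_monmap node03 && echo present || echo missing`.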

On May 23, 2021, at 16:40, Adrian Nicolae <adrian.nicolae@xxxxxxxxxx> wrote:
Hi guys,

I'm testing Ceph Pacific 16.2.4 in my lab before deciding whether to put it in production on a 1PB+ storage cluster with rgw-only access.

I noticed a weird issue with my mons:

- if I reboot a mon host, the ceph-mon container does not start after the reboot

- 'ceph orch ps' shows the following output:

mon.node01               node01               running (20h)   4m ago     20h   16.2.4     8d91d370c2b8  0a2e86af94b2
mon.node02               node02               running (115m)  12s ago    115m  16.2.4     8d91d370c2b8  51f4885a1b06
mon.node03               node03               stopped         4m ago     19h   <unknown>  <unknown>     <unknown>

(where node03 is the host which was rebooted).

- I tried to start the mon container manually on node03 with '/bin/bash /var/lib/ceph/c2d41ac4-baf5-11eb-865d-2dc838a337a3/mon.node03/unit.run' and got the following output:

debug 2021-05-23T08:24:25.192+0000 7f9a9e358700  0 mon.node03@-1(???).osd e408 crush map has features 3314933069573799936, adjusting msgr requires
debug 2021-05-23T08:24:25.192+0000 7f9a9e358700  0 mon.node03@-1(???).osd e408 crush map has features 432629308056666112, adjusting msgr requires
debug 2021-05-23T08:24:25.192+0000 7f9a9e358700  0 mon.node03@-1(???).osd e408 crush map has features 432629308056666112, adjusting msgr requires
debug 2021-05-23T08:24:25.192+0000 7f9a9e358700  0 mon.node03@-1(???).osd e408 crush map has features 432629308056666112, adjusting msgr requires
cluster 2021-05-23T08:07:12.189243+0000 mgr.node01.ksitls (mgr.14164) 36380 : cluster [DBG] pgmap v36392: 417 pgs: 417 active+clean; 33 KiB data, 605 MiB used, 651 GiB / 652 GiB avail; 9.6 KiB/s rd, 0 B/s wr, 15 op/s
debug 2021-05-23T08:24:25.196+0000 7f9a9e358700  1 mon.node03@-1(???).paxosservice(auth 1..51) refresh upgraded, format 0 -> 3
debug 2021-05-23T08:24:25.208+0000 7f9a88176700  1 heartbeat_map reset_timeout 'Monitor::cpu_tp thread 0x7f9a88176700' had timed out after 0.000000000s
debug 2021-05-23T08:24:25.208+0000 7f9a9e358700  0 mon.node03@-1(probing) e5  my rank is now 1 (was -1)
debug 2021-05-23T08:24:25.212+0000 7f9a87975700  0 mon.node03@1(probing) e6  removed from monmap, suicide.

root@node03:/home/adrian# systemctl status ceph-c2d41ac4-baf5-11eb-865d-2dc838a337a3@mon.node03.service
● ceph-c2d41ac4-baf5-11eb-865d-2dc838a337a3@mon.node03.service - Ceph mon.node03 for c2d41ac4-baf5-11eb-865d-2dc838a337a3
      Loaded: loaded (/etc/systemd/system/ceph-c2d41ac4-baf5-11eb-865d-2dc838a337a3@.service; enabled; vendor preset: enabled)
      Active: inactive (dead) since Sun 2021-05-23 08:10:00 UTC; 16min ago
     Process: 1176 ExecStart=/bin/bash /var/lib/ceph/c2d41ac4-baf5-11eb-865d-2dc838a337a3/mon.node03/unit.run (code=exited, status=0/SUCCESS)
     Process: 1855 ExecStop=/usr/bin/docker stop ceph-c2d41ac4-baf5-11eb-865d-2dc838a337a3-mon.node03 (code=exited, status=1/FAILURE)
     Process: 1861 ExecStopPost=/bin/bash /var/lib/ceph/c2d41ac4-baf5-11eb-865d-2dc838a337a3/mon.node03/unit.poststop (code=exited, status=0/SUCCESS)
    Main PID: 1176 (code=exited, status=0/SUCCESS)
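
The stopped daemon in the 'ceph orch ps' output above can also be picked out mechanically, which may help when checking several hosts after a reboot. This is a minimal sketch: `stopped_daemons` is a hypothetical helper name, and it assumes the status appears as a standalone `stopped` field, as in the listing above.

```shell
# Hypothetical helper: reads `ceph orch ps` text on stdin and prints the
# name (first column) of every daemon whose row has a "stopped" field.
stopped_daemons() {
    awk '{
        for (i = 2; i <= NF; i++) {
            if ($i == "stopped") { print $1; break }
        }
    }'
}
```

A possible use is `ceph orch ps | stopped_daemons`, which for the output above would list mon.node03.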

The only fix I could find was to redeploy the mon with:

ceph orch daemon rm mon.node03 --force
ceph orch daemon add mon node03

However, even though it works after the redeploy, an issue like this does not give me much confidence to use it in a production environment. I could reproduce it with 2 different mons, so it's not an isolated case.

My setup is based on Ubuntu 20.04 and uses docker instead of podman:

root@node01:~# docker -v
Docker version 20.10.6, build 370c289

Do you know of a workaround for this issue, or is this a known bug? I noticed that there are other reports of the same behaviour in Octopus as well, and the solution at that time was to delete the /var/lib/ceph/mon folder.

Thanks.

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx