Re: How to troubleshoot monitor node

Janne Johansson <icepic.dz@xxxxxxxxx> · Mon, 10 Jan 2022 16:13:10 +0100

Modern clusters use msgr2 on port 3300 by default, I think (6789 is the
legacy v1 port). Also, on the 192.168.14.48 host, check with
"netstat -an | grep LIST" or "ss -ntlp" whether anything is listening
on 6789 and/or 3300.
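
To make that check less error-prone, here is a minimal sketch that filters
"ss -ntl" output for the two monitor ports. It assumes the usual ss layout
for TCP-only output (State in the first column, local address:port in the
fourth); check_listening is just a helper name I made up, not part of ceph
or iproute2.

```shell
#!/bin/sh
# check_listening (hypothetical helper): read `ss -ntl` style output on stdin
# and report, for each port given as an argument, whether a LISTEN socket
# exists whose local address ends in ":<port>".
# Assumption: column 1 is the socket state, column 4 is local address:port.
check_listening() {
    ss_output=$(cat)
    for port in "$@"; do
        if printf '%s\n' "$ss_output" |
            awk -v p="$port" '
                $1 == "LISTEN" && $4 ~ (":" p "$") { found = 1 }
                END { exit !found }'
        then
            echo "port $port: listening"
        else
            echo "port $port: NOT listening"
        fi
    done
}

# Usage on the monitor host:
#   ss -ntl | check_listening 3300 6789
```

If both ports come back as NOT listening, the mon daemon on that host is
most likely not running at all, which would match the failed "nc -z" probes.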

On Mon, 10 Jan 2022 at 16:10, Andreas Feile <atann@xxxxxxxxxxxx> wrote:
>
> Hi all,
>
> I've set up a 6-node ceph cluster to learn how ceph works and what I can
> do with it. However, I'm new to ceph, so if the answer to one of my
> questions is RTFM, point me to the right place.
>
> My problem is this:
> The cluster consists of 3 mons and 3 osds. Even though the dashboard
> shows all green, the mon01 has a problem: the ceph command hangs and
> never comes back:
>
>
> root@mon01:~# ceph --version
> ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)
>
> root@mon01:~# ceph -s
> ^CCluster connection aborted
>
>
> To see what happens I tried this:
>
> root@mon01:~# ceph -s --debug-ms=1
> 2022-01-10T15:51:30.434+0100 7f4a2cd7e700 1 Processor -- start
> 2022-01-10T15:51:30.434+0100 7f4a2cd7e700 1 -- start start
> 2022-01-10T15:51:30.434+0100 7f4a2cd7e700 1 --2- >> [v2:192.168.14.48:3300/0,v1:192.168.14.48:6789/0] conn(0x7f4a28066a30 0x7f4a28066e40 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 rx=0 tx=0).connect
> 2022-01-10T15:51:30.434+0100 7f4a2cd7e700 1 -- --> [v2:192.168.14.48:3300/0,v1:192.168.14.48:6789/0] -- mon_getmap magic: 0 v1 -- 0x7f4a28067330 con 0x7f4a28066a30
> 2022-01-10T15:51:30.434+0100 7f4a2659c700 1 -- >> [v2:192.168.14.48:3300/0,v1:192.168.14.48:6789/0] conn(0x7f4a28066a30 msgr2=0x7f4a28066e40 unknown :-1 s=STATE_CONNECTING_RE l=0).process reconnect failed to v2:192.168.14.48:3300/0
> ...
>
>
> Indeed, both ports are closed:
>
> root@mon01:~# nc -z 192.168.14.48 6789; echo $?
> 1
> root@mon01:~# nc -z 192.168.14.48 3300; echo $?
> 1
>
> In /var/log/ceph/cephadm.log, I cannot see any useful information about
> what might be going wrong.
>
> I'm not aware of anything I could have done to trigger this error, and I
> wonder what I could do next to repair this monitor node.
>
> Any hint is appreciated.
>
> --
> Andre Tann
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx

-- 
May the most significant bit of your life be positive.