Hi all,
I've set up a 6-node ceph cluster to learn how ceph works and what I can
do with it. However, I'm new to Ceph, so if the answer to my question
is RTFM, please point me to the right place.
My problem is this:
The cluster consists of 3 mons and 3 osds. Even though the dashboard
shows all green, mon01 has a problem: the ceph command hangs and never
returns:
root@mon01:~# ceph --version
ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)
root@mon01:~# ceph -s
^CCluster connection aborted
To see what happens, I tried this:
root@mon01:~# ceph -s --debug-ms=1
2022-01-10T15:51:30.434+0100 7f4a2cd7e700 1 Processor -- start
2022-01-10T15:51:30.434+0100 7f4a2cd7e700 1 -- start start
2022-01-10T15:51:30.434+0100 7f4a2cd7e700 1 --2- >> [v2:192.168.14.48:3300/0,v1:192.168.14.48:6789/0] conn(0x7f4a28066a30 0x7f4a28066e40 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 rx=0 tx=0).connect
2022-01-10T15:51:30.434+0100 7f4a2cd7e700 1 -- --> [v2:192.168.14.48:3300/0,v1:192.168.14.48:6789/0] -- mon_getmap magic: 0 v1 -- 0x7f4a28067330 con 0x7f4a28066a30
2022-01-10T15:51:30.434+0100 7f4a2659c700 1 -- >> [v2:192.168.14.48:3300/0,v1:192.168.14.48:6789/0] conn(0x7f4a28066a30 msgr2=0x7f4a28066e40 unknown :-1 s=STATE_CONNECTING_RE l=0).process reconnect failed to v2:192.168.14.48:3300/0
...
Indeed, both ports are closed:
root@mon01:~# nc -z 192.168.14.48 6789; echo $?
1
root@mon01:~# nc -z 192.168.14.48 3300; echo $?
1
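Since both ports are closed, I assume the mon daemon itself is not
running, or at least not listening. If I understand cephadm correctly,
something like this should tell me whether the daemon and its systemd
unit are up (the unit name with the fsid is my guess from the docs,
not verified):
root@mon01:~# cephadm ls | grep -A 5 '"mon.mon01"'
root@mon01:~# systemctl status ceph-<fsid>@mon.mon01.service
root@mon01:~# ss -tlnp | grep -E '3300|6789'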
In /var/log/ceph/cephadm.log, I cannot find any useful information
about what might be going wrong.
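What I have not found yet is the monitor's own log. If I read the
cephadm docs correctly, a daemon's journal can be shown like this (I'm
assuming the daemon name is mon.mon01, matching the hostname):
root@mon01:~# cephadm logs --name mon.mon01
root@mon01:~# journalctl -u ceph-<fsid>@mon.mon01.service -n 100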
I'm not aware of anything I could have done to trigger this error, and I
wonder what I could do next to repair this monitor node.
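If the daemon is simply down, I suppose I could try to restart or
redeploy it through the orchestrator from one of the working mons,
e.g. like this (untested on my side, so please tell me if this is a
bad idea with only 3 mons):
root@mon02:~# ceph orch daemon restart mon.mon01
root@mon02:~# ceph orch daemon redeploy mon.mon01
But before I blindly restart things, I'd like to understand what
actually happened here.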
Any hint is appreciated.
--
Andre Tann