Re: How to troubleshoot monitor node

Boris Behrens <bb@xxxxxxxxx> · Mon, 10 Jan 2022 16:22:28 +0100

I would use the ss tool, because netstat truncates IPv6 addresses, so
you can't tell whether it is actually listening on the correct address.
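
Something like the following is a rough sketch of that check, not a tested
recipe. It assumes the iproute2 ss is installed and that you call it with
only -t (so the output has no Netid column); the helper name mon_listeners
is made up for illustration. It filters "ss -ntl" output for the two
monitor ports (3300 for msgr2, 6789 for msgr1) and prints the full local
address of each match, IPv6 included.

```shell
#!/bin/sh
# mon_listeners (hypothetical helper): read `ss -ntl` output on stdin and
# print the local address of every listener on a Ceph monitor port.
# 3300 = msgr2, 6789 = msgr1.
mon_listeners() {
    # Assumption: columns are State, Recv-Q, Send-Q, Local Address:Port, ...
    # so field 4 is the local address. Matching on the port suffix also
    # covers IPv6 forms such as [::]:3300.
    awk '$1 == "LISTEN" && $4 ~ /:(3300|6789)$/ { print $4 }'
}

# Usage on the monitor host:
#   ss -ntl | mon_listeners
```

If this prints nothing on mon01, no process is listening on either port,
which would match the nc results. Running "ss -ntlp" as root additionally
shows the owning process.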

On Mon, 10 Jan 2022 at 16:14, Janne Johansson <icepic.dz@xxxxxxxxx> wrote:

> Modern clusters use msgr2 communication on port 3300 by default, I think.
> Also, check on the 192.168.14.48 host with "netstat -an | grep LIST"
> or "ss -ntlp" whether anything is listening on 6789 and/or 3300.
>
> On Mon, 10 Jan 2022 at 16:10, Andreas Feile <atann@xxxxxxxxxxxx> wrote:
> >
> > Hi all,
> >
> > I've set up a 6-node ceph cluster to learn how ceph works and what I can
> > do with it. However, I'm new to ceph, so if the answer to one of my
> > questions is RTFM, point me to the right place.
> >
> > My problem is this:
> > The cluster consists of 3 mons and 3 osds. Even though the dashboard
> > shows all green, the mon01 has a problem: the ceph command hangs and
> > never comes back:
> >
> >
> > root@mon01:~# ceph --version
> > ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus
> > (stable)
> >
> > root@mon01:~# ceph -s
> > ^CCluster connection aborted
> >
> >
> > To see what happens I tried this:
> >
> > root@mon01:~# ceph -s --debug-ms=1
> > 2022-01-10T15:51:30.434+0100 7f4a2cd7e700 1 Processor -- start
> > 2022-01-10T15:51:30.434+0100 7f4a2cd7e700 1 -- start start
> > 2022-01-10T15:51:30.434+0100 7f4a2cd7e700 1 --2- >>
> > [v2:192.168.14.48:3300/0,v1:192.168.14.48:6789/0] conn(0x7f4a28066a30
> > 0x7f4a28066e40 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 rx=0
> > tx=0).connect
> > 2022-01-10T15:51:30.434+0100 7f4a2cd7e700 1 -- -->
> > [v2:192.168.14.48:3300/0,v1:192.168.14.48:6789/0] -- mon_getmap magic: 0
> > v1 -- 0x7f4a28067330 con 0x7f4a28066a30
> > 2022-01-10T15:51:30.434+0100 7f4a2659c700 1 -- >>
> > [v2:192.168.14.48:3300/0,v1:192.168.14.48:6789/0] conn(0x7f4a28066a30
> > msgr2=0x7f4a28066e40 unknown :-1 s=STATE_CONNECTING_RE l=0).process
> > reconnect failed to v2:192.168.14.48:3300/0
> > ...
> >
> >
> > Indeed, both ports are closed:
> >
> > root@mon01:~# nc -z 192.168.14.48 6789; echo $?
> > 1
> > root@mon01:~# nc -z 192.168.14.48 3300; echo $?
> > 1
> >
> > In /var/log/ceph/cephadm.log, I cannot see any useful information about
> > what might be going wrong.
> >
> > I'm not aware of anything I could have done to trigger this error, and I
> > wonder what I could do next to repair this monitor node.
> >
> > Any hint is appreciated.
> >
> > --
> > Andre Tann
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
>
>
> --
> May the most significant bit of your life be positive.
>

-- 
The "UTF-8 problems" self-help group will, as an exception, meet in the
large hall this time.