Re: How to troubleshoot monitor node

I would go with the ss tool, because netstat truncates long IPv6 addresses,
so you can't tell whether the daemon is actually listening on the correct
address.
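
A minimal sketch of that check, assuming iproute2's ss is installed on the mon host. The helper name mon_listeners is made up for illustration. It keeps only the listening sockets on the two monitor ports mentioned in this thread (3300 and 6789) and prints the full local address, so an IPv6 bind address can be read in full.

```shell
# Hypothetical helper: read `ss -ntl` output on stdin and print the local
# address:port of every socket bound to one of the two monitor ports.
mon_listeners() {
    # Field 4 of `ss -ntl` output is "Local Address:Port"; the header line
    # does not match the pattern, so it is dropped automatically.
    awk '$4 ~ /:(3300|6789)$/ { print $4 }'
}

# Usage on the monitor host (prints nothing if no mon is listening):
#   ss -ntl | mon_listeners
```

If this prints nothing on the affected host, the monitor daemon is most likely not running or not bound there, which would match the failed nc checks quoted below.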

On Mon, 10 Jan 2022 at 16:14, Janne Johansson <
icepic.dz@xxxxxxxxx> wrote:

> Modern clusters use msgr2 communication on port 3300 by default, I think.
> Also, check on the 192.168.14.48 host with "netstat -an | grep LIST"
> or "ss -ntlp" whether something is listening on 6789 and/or 3300.
>
> On Mon, 10 Jan 2022 at 16:10, Andreas Feile <atann@xxxxxxxxxxxx> wrote:
> >
> > Hi all,
> >
> > I've set up a 6-node ceph cluster to learn how ceph works and what I can
> > do with it. However, I'm new to ceph, so if the answer to one of my
> > questions is RTFM, point me to the right place.
> >
> > My problem is this:
> > The cluster consists of 3 mons and 3 osds. Even though the dashboard
> > shows all green, the mon01 has a problem: the ceph command hangs and
> > never comes back:
> >
> >
> > root@mon01:~# ceph --version
> > ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus
> > (stable)
> >
> > root@mon01:~# ceph -s
> > ^CCluster connection aborted
> >
> >
> > To see what happens I tried this:
> >
> > root@mon01:~# ceph -s --debug-ms=1
> > 2022-01-10T15:51:30.434+0100 7f4a2cd7e700 1 Processor -- start
> > 2022-01-10T15:51:30.434+0100 7f4a2cd7e700 1 -- start start
> > 2022-01-10T15:51:30.434+0100 7f4a2cd7e700 1 --2- >>
> > [v2:192.168.14.48:3300/0,v1:192.168.14.48:6789/0] conn(0x7f4a28066a30
> > 0x7f4a28066e40 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 rx=0
> > tx=0).connect
> > 2022-01-10T15:51:30.434+0100 7f4a2cd7e700 1 -- -->
> > [v2:192.168.14.48:3300/0,v1:192.168.14.48:6789/0] -- mon_getmap magic: 0
> > v1 -- 0x7f4a28067330 con 0x7f4a28066a30
> > 2022-01-10T15:51:30.434+0100 7f4a2659c700 1 -- >>
> > [v2:192.168.14.48:3300/0,v1:192.168.14.48:6789/0] conn(0x7f4a28066a30
> > msgr2=0x7f4a28066e40 unknown :-1 s=STATE_CONNECTING_RE l=0).process
> > reconnect failed to v2:192.168.14.48:3300/0
> > ...
> >
> >
> > Indeed, both ports are closed:
> >
> > root@mon01:~# nc -z 192.168.14.48 6789; echo $?
> > 1
> > root@mon01:~# nc -z 192.168.14.48 3300; echo $?
> > 1
> >
> > In /var/log/ceph/cephadm.log, I cannot see any useful information about
> > what might be going wrong.
> >
> > I'm not aware of anything I could have done to trigger this error, and I
> > wonder what I could do next to repair this monitor node.
> >
> > Any hint is appreciated.
> >
> > --
> > Andre Tann
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
>
>
> --
> May the most significant bit of your life be positive.
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.