Can't join new mon - lossy channel, failing

Hi,

I ran mkfs for a new mon, but the mon is stuck in the probing state. With debug logging enabled I see "fault on lossy channel, failing". Does this point to a bad (lossy) network, e.g. a crc mismatch?


2021-10-04 16:22:24.707 7f5952761700 10 mon.mon2@-1(probing) e10 probing other monitors
2021-10-04 16:22:24.707 7f5952761700  1 -- [v2:10.40.0.82:3300/0,v1:10.40.0.82:6789/0] send_to--> mon [v2:10.40.0.81:3300/0,v1:10.40.0.81:6789/0] -- mon_probe(probe 677f4be1-cd98-496d-8b50-1f99df0df670 name mon2 new mon_release 14) v7 -- ?+0 0x5602864cd480
2021-10-04 16:22:24.707 7f5952761700  1 -- [v2:10.40.0.82:3300/0,v1:10.40.0.82:6789/0] --> [v2:10.40.0.81:3300/0,v1:10.40.0.81:6789/0] -- mon_probe(probe 677f4be1-cd98-496d-8b50-1f99df0df670 name mon2 new mon_release 14) v7 -- 0x5602864cd480 con 0x560285455600
2021-10-04 16:22:24.707 7f5952761700  1 -- [v2:10.40.0.82:3300/0,v1:10.40.0.82:6789/0] send_to--> mon [v2:10.40.0.83:3300/0,v1:10.40.0.83:6789/0] -- mon_probe(probe 677f4be1-cd98-496d-8b50-1f99df0df670 name mon2 new mon_release 14) v7 -- ?+0 0x5602893ffc00
2021-10-04 16:22:24.707 7f5952761700  1 -- [v2:10.40.0.82:3300/0,v1:10.40.0.82:6789/0] --> [v2:10.40.0.83:3300/0,v1:10.40.0.83:6789/0] -- mon_probe(probe 677f4be1-cd98-496d-8b50-1f99df0df670 name mon2 new mon_release 14) v7 -- 0x5602893ffc00 con 0x560285455a80
2021-10-04 16:22:24.707 7f5952761700  1 -- [v2:10.40.0.82:3300/0,v1:10.40.0.82:6789/0] send_to--> mon [v2:10.40.0.86:3300/0,v1:10.40.0.86:6789/0] -- mon_probe(probe 677f4be1-cd98-496d-8b50-1f99df0df670 name mon2 new mon_release 14) v7 -- ?+0 0x560288e98a00
2021-10-04 16:22:24.707 7f5952761700  1 -- [v2:10.40.0.82:3300/0,v1:10.40.0.82:6789/0] --> [v2:10.40.0.86:3300/0,v1:10.40.0.86:6789/0] -- mon_probe(probe 677f4be1-cd98-496d-8b50-1f99df0df670 name mon2 new mon_release 14) v7 -- 0x560288e98a00 con 0x5602862d8000
2021-10-04 16:22:24.707 7f594ff5c700  1 -- [v2:10.40.0.82:3300/0,v1:10.40.0.82:6789/0] <== mon.1 v2:10.40.0.83:3300/0 581 ==== mon_probe(reply 677f4be1-cd98-496d-8b50-1f99df0df670 name ceph-03 quorum 0,1,2 paxos( fc 127723108 lc 127723840 ) mon_release 14) v7 ==== 504+0+0 (crc 0 0 0) 0x560287a94f00 con 0x560285455a80
2021-10-04 16:22:24.707 7f594ff5c700 10 mon.mon2@-1(probing) e10 handle_probe mon_probe(reply 677f4be1-cd98-496d-8b50-1f99df0df670 name ceph-03 quorum 0,1,2 paxos( fc 127723108 lc 127723840 ) mon_release 14) v7
2021-10-04 16:22:24.707 7f594ff5c700 10 mon.mon2@-1(probing) e10 handle_probe_reply mon.1 v2:10.40.0.83:3300/0 mon_probe(reply 677f4be1-cd98-496d-8b50-1f99df0df670 name ceph-03 quorum 0,1,2 paxos( fc 127723108 lc 127723840 ) mon_release 14) v7
2021-10-04 16:22:24.707 7f594ff5c700 10 mon.mon2@-1(probing) e10  monmap is e10: 3 mons at {ceph-01=[v2:10.40.0.81:3300/0,v1:10.40.0.81:6789/0],ceph-03=[v2:10.40.0.83:3300/0,v1:10.40.0.83:6789/0],ceph-06=[v2:10.40.0.86:3300/0,v1:10.40.0.86:6789/0]}
2021-10-04 16:22:24.707 7f594ff5c700 10 mon.mon2@-1(probing) e10  peer name is ceph-03
2021-10-04 16:22:24.707 7f594ff5c700 10 mon.mon2@-1(probing) e10  existing quorum 0,1,2
2021-10-04 16:22:24.707 7f594ff5c700 10 mon.mon2@-1(probing) e10  peer paxos version 127723840 vs my version 127723835 (ok)
2021-10-04 16:22:24.707 7f594ff5c700 10 mon.mon2@-1(probing) e10  ready to join, but i'm not in the monmap or my addr is blank, trying to join
2021-10-04 16:22:24.707 7f594ff5c700  1 -- [v2:10.40.0.82:3300/0,v1:10.40.0.82:6789/0] send_to--> mon [v2:10.40.0.81:3300/0,v1:10.40.0.81:6789/0] -- mon_join(mon2 [v2:10.40.0.82:3300/0,v1:10.40.0.82:6789/0]) v2 -- ?+0 0x5602864001c0
2021-10-04 16:22:24.707 7f594ff5c700  1 -- [v2:10.40.0.82:3300/0,v1:10.40.0.82:6789/0] --> [v2:10.40.0.81:3300/0,v1:10.40.0.81:6789/0] -- mon_join(mon2 [v2:10.40.0.82:3300/0,v1:10.40.0.82:6789/0]) v2 -- 0x5602864001c0 con 0x560285455600
2021-10-04 16:22:24.707 7f594ff5c700  1 -- [v2:10.40.0.82:3300/0,v1:10.40.0.82:6789/0] <== mon.2 v2:10.40.0.86:3300/0 574 ==== mon_probe(reply 677f4be1-cd98-496d-8b50-1f99df0df670 name ceph-06 quorum 0,1,2 paxos( fc 127723108 lc 127723840 ) mon_release 14) v7 ==== 504+0+0 (crc 0 0 0) 0x56028aa25480 con 0x5602862d8000
2021-10-04 16:22:24.707 7f594ff5c700 10 mon.mon2@-1(probing) e10 handle_probe mon_probe(reply 677f4be1-cd98-496d-8b50-1f99df0df670 name ceph-06 quorum 0,1,2 paxos( fc 127723108 lc 127723840 ) mon_release 14) v7
2021-10-04 16:22:24.707 7f594ff5c700 10 mon.mon2@-1(probing) e10 handle_probe_reply mon.2 v2:10.40.0.86:3300/0 mon_probe(reply 677f4be1-cd98-496d-8b50-1f99df0df670 name ceph-06 quorum 0,1,2 paxos( fc 127723108 lc 127723840 ) mon_release 14) v7
2021-10-04 16:22:24.707 7f594ff5c700 10 mon.mon2@-1(probing) e10  monmap is e10: 3 mons at {ceph-01=[v2:10.40.0.81:3300/0,v1:10.40.0.81:6789/0],ceph-03=[v2:10.40.0.83:3300/0,v1:10.40.0.83:6789/0],ceph-06=[v2:10.40.0.86:3300/0,v1:10.40.0.86:6789/0]}
2021-10-04 16:22:24.707 7f594ff5c700 10 mon.mon2@-1(probing) e10  peer name is ceph-06
2021-10-04 16:22:24.707 7f594ff5c700 10 mon.mon2@-1(probing) e10  existing quorum 0,1,2
2021-10-04 16:22:24.707 7f594ff5c700 10 mon.mon2@-1(probing) e10  peer paxos version 127723840 vs my version 127723835 (ok)
2021-10-04 16:22:24.707 7f594ff5c700 10 mon.mon2@-1(probing) e10  ready to join, but i'm not in the monmap or my addr is blank, trying to join
2021-10-04 16:22:24.707 7f594ff5c700  1 -- [v2:10.40.0.82:3300/0,v1:10.40.0.82:6789/0] send_to--> mon [v2:10.40.0.81:3300/0,v1:10.40.0.81:6789/0] -- mon_join(mon2 [v2:10.40.0.82:3300/0,v1:10.40.0.82:6789/0]) v2 -- ?+0 0x560286400400
2021-10-04 16:22:24.707 7f594ff5c700  1 -- [v2:10.40.0.82:3300/0,v1:10.40.0.82:6789/0] --> [v2:10.40.0.81:3300/0,v1:10.40.0.81:6789/0] -- mon_join(mon2 [v2:10.40.0.82:3300/0,v1:10.40.0.82:6789/0]) v2 -- 0x560286400400 con 0x560285455600
2021-10-04 16:22:24.779 7f594cf56700  1 --2- 10.40.0.82:0/9719 >> [v2:10.40.0.81:6898/2507925,v1:10.40.0.81:6899/2507925] conn(0x560287e40000 0x560287e56000 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 rx=0 tx=0).connect
2021-10-04 16:22:24.779 7f594cf56700  1 -- 10.40.0.82:0/9719 --> [v2:10.40.0.81:6898/2507925,v1:10.40.0.81:6899/2507925] -- mgropen(unknown.mon2) v3 -- 0x56028541d900 con 0x560287e40000
2021-10-04 16:22:24.779 7f5953f64700  1 --2- 10.40.0.82:0/9719 >> [v2:10.40.0.81:6898/2507925,v1:10.40.0.81:6899/2507925] conn(0x560287e40000 0x560287e56000 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=0 rx=0 tx=0)._handle_peer_banner_payload supported=1 required=0
2021-10-04 16:22:24.779 7f5953f64700 10 mon.mon2@-1(probing) e10 ms_get_authorizer for mgr
2021-10-04 16:22:24.779 7f5953f64700  1 --2- 10.40.0.82:0/9719 >> [v2:10.40.0.81:6898/2507925,v1:10.40.0.81:6899/2507925] conn(0x560287e40000 0x560287e56000 crc :-1 s=READY pgs=16872 cs=0 l=1 rev1=1 rx=0 tx=0).ready entity=mgr.62450337 client_cookie=5a76b276e3a3deca server_cookie=0 in_seq=0 out_seq=0
2021-10-04 16:22:24.779 7f5953f64700  1 -- 10.40.0.82:0/9719 >> [v2:10.40.0.81:6898/2507925,v1:10.40.0.81:6899/2507925] conn(0x560287e40000 msgr2=0x560287e56000 crc :-1 s=STATE_CONNECTION_ESTABLISHED l=1).read_bulk peer close file descriptor 28
2021-10-04 16:22:24.779 7f5953f64700  1 -- 10.40.0.82:0/9719 >> [v2:10.40.0.81:6898/2507925,v1:10.40.0.81:6899/2507925] conn(0x560287e40000 msgr2=0x560287e56000 crc :-1 s=STATE_CONNECTION_ESTABLISHED l=1).read_until read failed
2021-10-04 16:22:24.779 7f594c755700  1 -- 10.40.0.82:0/9719 <== mgr.62450337 v2:10.40.0.81:6898/2507925 1 ==== mgrconfigure(period=5, threshold=5) v3 ==== 12+0+0 (crc 0 0 0) 0x560287dd3a20 con 0x560287e40000
2021-10-04 16:22:24.779 7f5953f64700  1 --2- 10.40.0.82:0/9719 >> [v2:10.40.0.81:6898/2507925,v1:10.40.0.81:6899/2507925] conn(0x560287e40000 0x560287e56000 crc :-1 s=READY pgs=16872 cs=0 l=1 rev1=1 rx=0 tx=0).handle_read_frame_preamble_main read frame preamble failed r=-1 ((1) Operation not permitted)
2021-10-04 16:22:24.779 7f5953f64700  1 --2- 10.40.0.82:0/9719 >> [v2:10.40.0.81:6898/2507925,v1:10.40.0.81:6899/2507925] conn(0x560287e40000 0x560287e56000 crc :-1 s=READY pgs=16872 cs=0 l=1 rev1=1 rx=0 tx=0).stop
2021-10-04 16:22:24.779 7f594c755700  1 -- 10.40.0.82:0/9719 --> [v2:10.40.0.81:6898/2507925,v1:10.40.0.81:6899/2507925] -- mgrreport(unknown.mon2 +100-0 packed 1174 task_status=0) v8 -- 0x5602860a9880 con 0x560287e40000
2021-10-04 16:22:24.779 7f594c755700  1 -- 10.40.0.82:0/9719 >> [v2:10.40.0.81:6898/2507925,v1:10.40.0.81:6899/2507925] conn(0x560287e40000 msgr2=0x560287e56000 unknown :-1 s=STATE_CLOSED l=1).mark_down
2021-10-04 16:22:24.779 7f594c755700  1 --2- 10.40.0.82:0/9719 >> [v2:10.40.0.81:6898/2507925,v1:10.40.0.81:6899/2507925] conn(0x560287e40000 0x560287e56000 unknown :-1 s=CLOSED pgs=16872 cs=0 l=1 rev1=1 rx=0 tx=0).stop
2021-10-04 16:22:24.839 7f594ef5a700  1 --2- [v2:10.40.0.82:3300/0,v1:10.40.0.82:6789/0] >>  conn(0x560287aac880 0x5602875c4a00 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 rx=0 tx=0).accept
2021-10-04 16:22:24.839 7f5953f64700  1 --2- [v2:10.40.0.82:3300/0,v1:10.40.0.82:6789/0] >>  conn(0x560287aac880 0x5602875c4a00 unknown :-1 s=BANNER_ACCEPTING pgs=0 cs=0 l=0 rev1=0 rx=0 tx=0)._handle_peer_banner_payload supported=1 required=0
2021-10-04 16:22:24.839 7f5953f64700  1 -- [v2:10.40.0.82:3300/0,v1:10.40.0.82:6789/0] >>  conn(0x560287aac880 msgr2=0x5602875c4a00 unknown :-1 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send send error: (32) Broken pipe
2021-10-04 16:22:24.839 7f5953f64700  1 --2- [v2:10.40.0.82:3300/0,v1:10.40.0.82:6789/0] >>  conn(0x560287aac880 0x5602875c4a00 unknown :-1 s=HELLO_ACCEPTING pgs=0 cs=0 l=0 rev1=1 rx=0 tx=0).write hello frame write failed r=-32 ((32) Broken pipe)
2021-10-04 16:22:24.839 7f5953f64700  1 --2- [v2:10.40.0.82:3300/0,v1:10.40.0.82:6789/0] >>  conn(0x560287aac880 0x5602875c4a00 unknown :-1 s=HELLO_ACCEPTING pgs=0 cs=0 l=0 rev1=1 rx=0 tx=0).stop
2021-10-04 16:22:24.839 7f594ff5c700 10 mon.mon2@-1(probing) e10 ms_handle_reset 0x560287aac880
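
To see which connections are actually failing, the failure lines in a log like the one above can be counted per peer address. This is only a rough sketch: the marker strings, the regular expression, and the function name `summarize_failures` are my own assumptions based on this excerpt, not anything Ceph provides.

```python
import re
from collections import Counter

# Substrings treated as failure indicators. Picked by hand from the
# log excerpt above (assumption, not an official Ceph list).
FAILURE_MARKERS = ("failed", "Broken pipe", "peer close", "lossy channel")

# Matches the remote v2 address in lines such as
# ">> [v2:10.40.0.81:6898/2507925,v1:...]".
PEER_RE = re.compile(r">>\s+\[?v2:(\d+\.\d+\.\d+\.\d+:\d+)")


def summarize_failures(lines):
    """Count failure lines per remote peer address.

    Lines with a failure marker but no recognizable peer address
    (for example an accepted connection that never identified
    itself) are counted under the key "unknown".
    """
    counts = Counter()
    for line in lines:
        if not any(marker in line for marker in FAILURE_MARKERS):
            continue
        match = PEER_RE.search(line)
        peer = match.group(1) if match else "unknown"
        counts[peer] += 1
    return counts
```

Fed the lines of the excerpt above (for example via `summarize_failures(open(path))` with the path to the mon log), it should attribute the failures to the mgr endpoint 10.40.0.81:6898 and to an accepted connection with no recorded peer address, rather than to the mon-to-mon connections on port 3300.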



Thanks,
k