luminous: OSDs keep going down because of heartbeat timeouts

Hi all, thanks for reading this message.


I have one Ceph cluster running Ceph v12.2.12. It ran well for about half a year.


Last week we added another two machines to this Ceph cluster. Then all the OSDs became unstable.


The OSDs' async messenger complains that they cannot heartbeat to each other, but ping over the network shows no dropped packets and no packet errors.


I use bond0 for both the Ceph front and back networks. After I set the nodown and noout flags, the cluster became stable,


but in the logs I still see a lot of async messenger errors. I have also tried the simple messenger, and it gives the same errors.


All the OSDs log errors like the ones below:


NG_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg accept replacing existing (lossy) channel (new one lossy=1)
2020-04-02 09:59:17.989469 7f42794da700  0 -- 10.255.255.54:6814/1000006 >> 10.255.255.56:0/7 conn(0x55721e0e5800 :6814 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg accept replacing existing (lossy) channel (new one lossy=1)
2020-04-02 09:59:17.989557 7f42784d8700  0 -- 10.255.255.54:6819/1000006 >> 10.255.255.52:0/7 conn(0x55721e0e8800 :6819 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg accept replacing existing (lossy) channel (new one lossy=1)
2020-04-02 09:59:17.989728 7f4278cd9700  0 -- 10.255.255.54:6814/1000006 >> 10.255.255.55:0/7 conn(0x55722973b000 :6814 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg accept replacing existing (lossy) channel (new one lossy=1)
2020-04-02 09:59:17.989872 7f42794da700  0 -- 10.255.255.54:6819/1000006 >> 10.255.255.55:0/7 conn(0x557225b15000 :6819 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg accept replacing existing (lossy) channel (new one lossy=1)
2020-04-02 09:59:17.990111 7f42794da700  0 -- 10.255.255.54:6819/1000006 >> 10.255.255.55:0/7 conn(0x557228506000 :6819 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg accept replacing existing (lossy) channel (new one lossy=1)
2020-04-02 09:59:17.990161 7f42784d8700  0 -- 10.255.255.54:6819/1000006 >> 10.255.255.56:0/7 conn(0x557226320000 :6819 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg accept replacing existing (lossy) channel (new one lossy=1)
2020-04-02 09:59:17.990196 7f42794da700  0 -- 10.255.255.54:6814/1000006 >> 10.255.255.56:0/7 conn(0x55722650b000 :6814 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg accept replacing existing (lossy) channel (new one lossy=1)
2020-04-02 09:59:17.991450 7f4278cd9700  0 -- 10.255.255.54:6819/1000006 >> 10.255.255.55:0/7 conn(0x5572298d7800 :6819 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg accept replacing existing (lossy) channel (new one lossy=1)
2020-04-02 09:59:17.991458 7f42784d8700  0 -- 10.255.255.54:6814/1000006 >> 10.255.255.52:0/7 conn(0x557226f19000 :6814 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg accept replacing existing (lossy) channel (new one lossy=1)
2020-04-02 09:59:17.991639 7f4278cd9700  0 -- 10.255.255.54:6819/1000006 >> 10.255.255.52:0/7 conn(0x557226867800 :6819 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg accept replacing existing (lossy) channel (new one lossy=1)
2020-04-02 09:59:17.991798 7f42794da700  0 -- 10.255.255.54:6814/1000006 >> 10.255.255.56:0/7 conn(0x55722a20b000 :6814 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg accept replacing existing (lossy) channel (new one lossy=1)
2020-04-02 09:59:17.991842 7f42784d8700  0 -- 10.255.255.54:6819/1000006 >> 10.255.255.56:0/7 conn(0x557226869000 :6819 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1).handle_connect_msg accept replacing existing (lossy) channel (new one lossy=1)
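
To see which peers are reconnecting most often, log lines like the ones above can be tallied by peer address. The following is a minimal Python sketch, not Ceph tooling: the function name `count_replacements` and the regular expression are assumptions made for illustration, and the pattern only targets the line layout shown in this excerpt.

```python
import re
from collections import Counter

# Matches the "accept replacing existing" messenger lines shown above.
# The pattern is an assumption based on this excerpt only.
LINE_RE = re.compile(
    r"-- (?P<local>\S+) >> (?P<peer>\S+) conn\(\S+ :\d+ s=(?P<state>\S+) "
    r".*handle_connect_msg accept replacing existing"
)


def count_replacements(lines):
    """Count 'accept replacing existing' events per peer IP address."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            # The peer looks like "10.255.255.56:0/7"; keep only the IP part.
            peer_ip = m.group("peer").split(":")[0]
            counts[peer_ip] += 1
    return counts


# Three lines copied from the log excerpt above.
sample = [
    "2020-04-02 09:59:17.989469 7f42794da700  0 -- 10.255.255.54:6814/1000006 "
    ">> 10.255.255.56:0/7 conn(0x55721e0e5800 :6814 "
    "s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1)."
    "handle_connect_msg accept replacing existing (lossy) channel (new one lossy=1)",
    "2020-04-02 09:59:17.989557 7f42784d8700  0 -- 10.255.255.54:6819/1000006 "
    ">> 10.255.255.52:0/7 conn(0x55721e0e8800 :6819 "
    "s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1)."
    "handle_connect_msg accept replacing existing (lossy) channel (new one lossy=1)",
    "2020-04-02 09:59:17.990196 7f42794da700  0 -- 10.255.255.54:6814/1000006 "
    ">> 10.255.255.56:0/7 conn(0x55722650b000 :6814 "
    "s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=1)."
    "handle_connect_msg accept replacing existing (lossy) channel (new one lossy=1)",
]

for peer, n in count_replacements(sample).most_common():
    print(peer, n)
```

Feeding a whole OSD log through the same function (for example by passing an open file object) would show whether the reconnects come evenly from all hosts or mostly from a few peers.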


The network config:
bond0     Link encap:Ethernet  HWaddr 6c:92:bf:c2:8e:e5  
          inet6 addr: fe80::6e92:bfff:fec2:8ee5/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:126155520073 errors:0 dropped:3217298 overruns:0 frame:0
          TX packets:133297822313 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:57485747361080 (57.4 TB)  TX bytes:71267041966300 (71.2 TB)


bond0.38  Link encap:Ethernet  HWaddr 6c:92:bf:c2:8e:e5  
          inet addr:192.168.38.54  Bcast:192.168.38.255  Mask:255.255.255.0
          inet6 addr: fe80::6e92:bfff:fec2:8ee5/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:60802363 errors:0 dropped:0 overruns:0 frame:0
          TX packets:53614452 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:34857574617 (34.8 GB)  TX bytes:23829455266 (23.8 GB)


bond0.4000 Link encap:Ethernet  HWaddr 6c:92:bf:c2:8e:e5  
          inet addr:10.255.255.54  Bcast:10.255.255.63  Mask:255.255.255.192
          inet6 addr: fe80::6e92:bfff:fec2:8ee5/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:107628762285 errors:0 dropped:0 overruns:0 frame:0
          TX packets:96091921746 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:54705054842566 (54.7 TB)  TX bytes:68763270985565 (68.7 TB)


brq86d8e0ef-fa Link encap:Ethernet  HWaddr 26:30:9e:96:7a:71  
          UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
          RX packets:2512246 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:338012546 (338.0 MB)  TX bytes:0 (0.0 B)


docker0   Link encap:Ethernet  HWaddr 02:42:67:87:8d:a7  
          inet addr:172.17.0.1  Bcast:172.17.255.255  Mask:255.255.0.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)


eno1      Link encap:Ethernet  HWaddr 6c:92:bf:c2:8e:e5  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:62792632732 errors:0 dropped:614221 overruns:0 frame:0
          TX packets:66647497482 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:28698305745018 (28.6 TB)  TX bytes:35631476375125 (35.6 TB)


eno2      Link encap:Ethernet  HWaddr 6c:92:bf:c2:8e:e5  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:63362887347 errors:0 dropped:633400 overruns:0 frame:0
          TX packets:66650324833 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:28787441616499 (28.7 TB)  TX bytes:35635565591656 (35.6 TB)


lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:3272039707 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3272039707 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1 
          RX bytes:753581347467 (753.5 GB)  TX bytes:753581347467 (753.5 GB)


tap579ce88c-9e Link encap:Ethernet  HWaddr fe:16:3e:32:3b:0d  
          inet6 addr: fe80::fc16:3eff:fe32:3b0d/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
          RX packets:3322983 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3480283 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:2221592585 (2.2 GB)  TX bytes:1380408263 (1.3 GB)


tapa32c35b1-87 Link encap:Ethernet  HWaddr fe:16:3e:79:65:9f  
          inet6 addr: fe80::fc16:3eff:fe79:659f/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
          RX packets:22360161 errors:0 dropped:0 overruns:0 frame:0
          TX packets:25585406 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:3953751232 (3.9 GB)  TX bytes:6985195478 (6.9 GB)


vxlan-100 Link encap:Ethernet  HWaddr 26:30:9e:96:7a:71  
          UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
          RX packets:7671113091 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6732694121 errors:0 dropped:17713 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:2959527236402 (2.9 TB)  TX bytes:973862583509 (973.8 GB)
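
The counters above do show non-zero RX drops on bond0, eno1, and eno2, even though ping reports no loss. Whether those drops are related to the heartbeat failures is not established here. As a rough aid for comparing hosts, the sketch below computes the RX drop ratio per interface from this style of ifconfig output. It is a minimal Python sketch: the function name `rx_drop_ratios` and the regular expressions are illustrative assumptions and only target the output layout shown in this post.

```python
import re

# Interface header line, e.g. "bond0     Link encap:Ethernet ..."
IFACE_RE = re.compile(r"^(\S+)\s+Link encap:")
# RX counter line, e.g. "RX packets:123 errors:0 dropped:4 ..."
RX_RE = re.compile(r"RX packets:(\d+) errors:(\d+) dropped:(\d+)")


def rx_drop_ratios(ifconfig_text):
    """Return {interface: dropped / packets} for the RX counters."""
    ratios = {}
    iface = None
    for line in ifconfig_text.splitlines():
        m = IFACE_RE.match(line)
        if m:
            iface = m.group(1)
            continue
        m = RX_RE.search(line)
        if m and iface is not None:
            packets, _errors, dropped = map(int, m.groups())
            # Avoid division by zero on idle interfaces such as docker0.
            ratios[iface] = dropped / packets if packets else 0.0
    return ratios


# Header and RX lines copied from the output above.
SAMPLE = """\
bond0     Link encap:Ethernet  HWaddr 6c:92:bf:c2:8e:e5
          RX packets:126155520073 errors:0 dropped:3217298 overruns:0 frame:0
eno1      Link encap:Ethernet  HWaddr 6c:92:bf:c2:8e:e5
          RX packets:62792632732 errors:0 dropped:614221 overruns:0 frame:0
docker0   Link encap:Ethernet  HWaddr 02:42:67:87:8d:a7
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
"""

for name, ratio in rx_drop_ratios(SAMPLE).items():
    print(f"{name}: {ratio:.2e}")
```

Because these counters are cumulative since boot, sampling them twice a few minutes apart and comparing the difference would say more about current behaviour than a single snapshot.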
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


