corosync and network flow control

On a cluster with 10 nodes, corosync cpg_mcast_joined() starts to simply return CPG_ERR_TRY_AGAIN.

 

I can keep retrying cpg_mcast_joined() (over a long time), but it never gets back to normal behavior.
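
To illustrate what I mean, here is a minimal sketch of the kind of retry loop I am describing (the back-off interval and retry limit are only illustrative, and I am using the corosync 2.x name CS_ERR_TRY_AGAIN here):

#include <stddef.h>
#include <unistd.h>
#include <sys/uio.h>
#include <corosync/corotypes.h>
#include <corosync/cpg.h>

/* Retry cpg_mcast_joined() while the library reports congestion.
 * Back-off interval and retry limit are illustrative only. */
static cs_error_t send_message(cpg_handle_t handle, const void *buf, size_t len)
{
        struct iovec iov = { .iov_base = (void *) buf, .iov_len = len };
        cs_error_t res;
        int i;

        for (i = 0; i < 100; i++) {
                res = cpg_mcast_joined(handle, CPG_TYPE_AGREED, &iov, 1);
                if (res != CS_ERR_TRY_AGAIN)
                        break;
                usleep(100000);  /* wait 100 ms, then try again */
        }
        /* after the flow-control change this always ends as CS_ERR_TRY_AGAIN */
        return res;
}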

 

The node uses bonding (eth0/eth1), and one network card failed several times:

 

# grep 'kernel: igb:' var/log/syslog*

var/log/syslog.2:Aug  6 21:26:32 pve01 kernel: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None

var/log/syslog.2:Aug  6 21:27:00 pve01 kernel: igb: eth0 NIC Link is Down

var/log/syslog.2:Aug  6 21:27:02 pve01 kernel: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None

var/log/syslog.2:Aug  6 21:28:02 pve01 kernel: igb: eth0 NIC Link is Down

var/log/syslog.2:Aug  6 21:28:08 pve01 kernel: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None

var/log/syslog.2:Aug  6 21:29:11 pve01 kernel: igb: eth0 NIC Link is Down

 

Up to this point there was no problem; corosync worked as expected.

 

var/log/syslog.2:Aug  6 21:29:13 pve01 kernel: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX

 

Please note that flow control changed to RX/TX.

 

After that, corosync is completely unusable. Should I turn off flow control?
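
For reference, the current pause settings can be read with "ethtool -a eth0"; a minimal sketch of the same query via the SIOCETHTOOL ioctl (the interface name is just an example) looks like this:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <linux/ethtool.h>
#include <linux/sockios.h>

/* Print the pause (flow control) settings of an interface,
 * roughly what "ethtool -a eth0" reports. "eth0" is an example. */
int main(void)
{
        struct ethtool_pauseparam pp = { .cmd = ETHTOOL_GPAUSEPARAM };
        struct ifreq ifr;
        int fd = socket(AF_INET, SOCK_DGRAM, 0);

        if (fd < 0) {
                perror("socket");
                return 1;
        }
        memset(&ifr, 0, sizeof(ifr));
        strncpy(ifr.ifr_name, "eth0", IFNAMSIZ - 1);
        ifr.ifr_data = (void *) &pp;

        if (ioctl(fd, SIOCETHTOOL, &ifr) < 0) {
                perror("SIOCETHTOOL");
                close(fd);
                return 1;
        }
        printf("autoneg: %u  rx: %u  tx: %u\n", pp.autoneg, pp.rx_pause, pp.tx_pause);
        close(fd);
        return 0;
}

(If disabling flow control is the answer, I assume something like "ethtool -A eth0 rx off tx off" on each slave is the way to do it.)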

 

Is that a known problem?

 

- Dietmar

_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss
