Pete Wright wrote:
Hi All,
I've been noticing an issue on a couple of boxen I have running
CentOS 4.1. Here is the uname -a:
Linux xxx 2.6.9-11.ELsmp #1 SMP Wed Jun 8 16:59:12 CDT 2005 x86_64
x86_64 x86_64 GNU/Linux
I am setting the bonding module options as follows in our modprobe.conf:
options bond0 mode=1 arp_interval=500 arp_ip_target=<gateway.ip>
We are bonding two devices, and basic networking and failover are
working correctly. I can disable one link and packets pass through the
second interface as expected. The problem is that the backup slave
device (eth1) appears to be flapping. I have checked the switches this
device connects to and do not see any errors there. The ports are not
losing link. These are the messages in /var/log/messages:
<snip - sorry for wrapping>
May 7 04:02:46 critblade204 kernel: bonding: bond0: backup interface
eth1 is now down
May 7 04:02:46 critblade204 kernel: bonding: bond0: backup interface
eth1 is now down
May 7 04:02:47 critblade204 kernel: bonding: bond0: backup interface
eth1 is now up
May 7 04:02:47 critblade204 kernel: bonding: bond0: backup interface
eth1 is now up
May 7 04:02:48 critblade204 kernel: bonding: bond0: backup interface
eth1 is now down
May 7 04:02:48 critblade204 kernel: bonding: bond0: backup interface
eth1 is now down
</snip>
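For anyone who wants to quantify the flapping rather than eyeball the log, here is a minimal Python sketch that counts up/down transitions per slave. It assumes the messages have exactly the format quoted above (unwrapped, one message per line); the function name count_transitions and the regular expression are my own illustration, not anything taken from the bonding driver, and other kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Assumed message format, taken from the excerpt above (one message per line):
#   "... kernel: bonding: bond0: backup interface eth1 is now down"
PATTERN = re.compile(r"bonding: (\S+): \w+ interface (\S+) is now (up|down)")


def count_transitions(lines):
    """Count link-state messages, keyed by (bond, slave, state)."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.groups()] += 1
    return counts


# Sample lines copied from the excerpt above, unwrapped.
sample = [
    "May 7 04:02:46 critblade204 kernel: bonding: bond0: backup interface eth1 is now down",
    "May 7 04:02:47 critblade204 kernel: bonding: bond0: backup interface eth1 is now up",
    "May 7 04:02:48 critblade204 kernel: bonding: bond0: backup interface eth1 is now down",
]

for (bond, slave, state), n in sorted(count_transitions(sample).items()):
    print(bond, slave, state, n)
```

To run it against a real log, pass an open file handle instead of the sample list, for example count_transitions(open("/var/log/messages")).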
It seems this was happening for several days while the machine was
idle (it is a development box, not in production yet). I ssh'd into
the box today and it seems to have stopped flapping for the time being.
Any help would be appreciated, and if anyone needs more info or
troubleshooting data I'll be more than willing to provide it.
As per a suggestion by jarmo.jarvenpaa@xxxxxxxxxxx, I forced the link
speed via ethtool:
ETHTOOL_OPTS="speed 1000 duplex full autoneg off"
Unfortunately this did not work. I am going to start looking at the code
for the bonding driver now. Is it possible this could be a network
driver related issue (all NICs are using the tg driver)? Up to this
point I've been assuming that this is due to a bug in the bonding.ko code.
Thanks,
Pete Wright
--
Peter Wright
Systems Administrator
Sony Pictures Imageworks
wright@xxxxxxxxxxxxxx
www.imageworks.com