Re: Node is randomly fenced

On 12/06/14 12:33 PM, yvette hirth wrote:
> On 06/12/2014 08:32 AM, Schaefer, Micah wrote:
> 
>> Yesterday I added bonds on nodes 3 and 4. Today, node4 was active and
>> fenced, then node3 was fenced when node4 came back online. The network
>> topology is as follows:
>> switch1: node1, node3 (two connections)
>> switch2: node2, node4 (two connections)
>> switch1 <-> switch2
>> All on the same subnet
>>
>> I set up link monitoring of the NICs at 100 milliseconds in active-backup
>> mode, and saw no messages about link problems before the fence.
>>
>> I see multicast between the servers using tcpdump.
>>
>> Any more ideas?
> 
> spanning-tree scans/rebuilds happen on 10Gb circuits just like they do
> on 1Gb circuits, and when they happen, traffic on the switches *can*
> come to a grinding halt, depending upon the switch firmware and the type
> of spanning-tree scan/rebuild being done.
> 
> you may want to check your switch logs to see if any spanning-tree
> rebuilds were being done at the time of the fence.
> 
> just an idea, and hth
> yvette hirth

When I've seen this (I now disable STP entirely), it blocked all traffic,
so I would expect multiple or all nodes to partition off on their own
rather than a single node being fenced. Still, worth looking into. :)
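
Since the bonds were only added yesterday, it may also be worth confirming
what the bonding driver itself reports, in addition to watching the logs.
Below is a minimal sketch, not something from the original setup: it parses
the text the Linux bonding driver exposes under /proc/net/bonding/ and pulls
out the mode, the MII polling interval, the active slave and each slave's
link failure count. The function name parse_bonding and the shape of the
returned dict are made up for illustration, and the bond name bond0 used
afterwards is an assumption.

```python
def parse_bonding(text):
    """Summarise the contents of a /proc/net/bonding/<bond> file.

    Returns a dict with the bonding mode, the MII polling interval in
    milliseconds, the currently active slave and per-slave link state.
    (Hypothetical helper; the dict layout is an illustration only.)
    """
    info = {"mode": None, "miimon_ms": None, "active": None, "slaves": {}}
    current = None  # dict of the slave section being read, if any
    for line in text.splitlines():
        # Every interesting line has the form "Key: value".
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            info["mode"] = value
        elif key == "MII Polling Interval (ms)":
            info["miimon_ms"] = int(value)
        elif key == "Currently Active Slave":
            info["active"] = value
        elif key == "Slave Interface":
            # Start of a per-slave section; later keys belong to it.
            current = info["slaves"].setdefault(value, {})
        elif current is not None and key == "MII Status":
            current["mii_status"] = value
        elif current is not None and key == "Link Failure Count":
            current["link_failures"] = int(value)
    return info
```

On a node you would feed it the file contents, for example
parse_bonding(open("/proc/net/bonding/bond0").read()), assuming the bond is
called bond0. A non-zero link failure count on node3 or node4 around the
time of the fence would point at the links or the bond rather than at STP.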

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?

-- 
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster




