On 12/06/14 12:33 PM, yvette hirth wrote:
> On 06/12/2014 08:32 AM, Schaefer, Micah wrote:
>
>> Yesterday I added bonds on nodes 3 and 4. Today, node4 was active and
>> fenced, then node3 was fenced when node4 came back online. The network
>> topology is as follows:
>> switch1: node1, node3 (two connections)
>> switch2: node2, node4 (two connections)
>> switch1 <-> switch2
>> All on the same subnet
>>
>> I set up monitoring at 100 millisecond of the nics in active-backup mode,
>> and saw no messages about link problems before the fence.
>>
>> I see multicast between the servers using tcpdump.
>>
>> Any more ideas?
>
> spanning-tree scans/rebuilds happen on 10Gb circuits just like they do
> on 1Gb circuits, and when they happen, traffic on the switches *can*
> come to a grinding halt, depending upon the switch firmware and the type
> of spanning-tree scan/rebuild being done.
>
> you may want to check your switch logs to see if any spanning-tree
> rebuilds were being done at the time of the fence.
>
> just an idea, and hth
> yvette hirth

When I've seen this (I now disable STP entirely), it blocks all traffic
so I would expect multiple/all nodes to partition off on their own.
Still, worth looking into. :)

--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster