I just found that the clock on node1 was off by about a minute and a half
compared to the rest of the nodes.

I am running ntp, so I am not sure why the time wasn't synced up. I wonder
whether node1, being behind, would think it was not receiving updates from
the other nodes?

On 6/12/14, 1:29 PM, "Digimer" <lists@xxxxxxxxxx> wrote:

>Even if the token changes stop the immediate fencing, don't leave it
>please. There is something fundamentally wrong that you need to
>identify/fix.
>
>Keep us posted!
>
>On 12/06/14 01:24 PM, Schaefer, Micah wrote:
>> The servers do not run any tasks other than the tasks in the cluster
>> service group.
>>
>> Nodes 3 and 4 are physical servers with a lot of horsepower and nodes 1
>> and 2 are virtual machines with far fewer resources available.
>>
>> I adjusted the token settings and will watch for any change.
>>
>> On 6/12/14, 1:08 PM, "Digimer" <lists@xxxxxxxxxx> wrote:
>>
>>> On 12/06/14 12:48 PM, Schaefer, Micah wrote:
>>>> As far as the switch goes, both are Cisco Catalyst 6509-E, no spanning
>>>> tree changes are happening and all the ports have port-fast enabled for
>>>> these servers. My switch logging level is very high and I have no
>>>> messages in relation to the time frames or ports.
>>>>
>>>> TOTEM reports that "A processor joined or left the membership...", but
>>>> that isn't enough detail.
>>>>
>>>> Also note that I did not have these issues until adding new servers:
>>>> node3 and node4 to the cluster. Node1 and node2 do not fence each other
>>>> (unless a real issue is there), and they are on different switches.
>>>
>>> Then I can't imagine it being network anymore. Seeing as both node 3 and
>>> 4 get fenced, it's likely not hardware either. Are the workloads on 3
>>> and 4 much higher (or are the computers much slower) than 1 and 2? I'm
>>> wondering if the nodes are simply not keeping up with corosync traffic.
>>> You might try adjusting the corosync token timeout and retransmit counts
>>> to see if that reduces the node losses.
>>>
>>> --
>>> Digimer
>>> Papers and Projects: https://alteeve.ca/w/
>>> What if the cure for cancer is trapped in the mind of a person without
>>> access to education?

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster