Re: Node is randomly fenced

Digimer <lists@xxxxxxxxxx> · Wed, 04 Jun 2014 11:13:15 -0400

On 04/06/14 10:59 AM, Schaefer, Micah wrote:
> I have a 4 node cluster, running a single service group. I have been
> seeing node1 fence node3 while node3 is actively running the service group
> at random intervals.
>
> Rgmanager logs show no failures in service checks, and no other logs
> provide any useful information. How can I go about finding out why node1
> is fencing node3?
>
> I currently set up the failover domain to be restricted and not include
> node3.
>
> cluster.conf : http://pastebin.com/xYy6xp6N

Random fencing is almost always caused by network failures. Can you look
at the system logs, starting a little before the fence and continuing
until after the fence completes, and paste them here? I suspect you will
see corosync complaining.
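
If it helps to cut the logs down to just that window, here is a rough
Python sketch. It assumes traditional syslog timestamps of the form
"Jun  4 10:55:01" (no year, English month names) at the start of each
line; the function names, the five-minute margins, and the file path in
the usage note are all hypothetical, so adjust them to your setup.

```python
import re
from datetime import datetime, timedelta

# Assumption: each log entry starts with a classic syslog timestamp,
# e.g. "Jun  4 10:55:01 host daemon[pid]: message" (no year in the line).
STAMP = re.compile(r"^([A-Z][a-z]{2})\s+(\d{1,2})\s+(\d{2}:\d{2}:\d{2})\s")


def parse_stamp(line, year):
    """Return the line's timestamp as a datetime, or None if it has none."""
    m = STAMP.match(line)
    if not m:
        return None
    mon, day, clock = m.groups()
    try:
        # %b expects English month abbreviations in the default C locale.
        return datetime.strptime(
            f"{year} {mon} {int(day):02d} {clock}", "%Y %b %d %H:%M:%S"
        )
    except ValueError:
        return None


def log_window(lines, fence_time, before_min=5, after_min=5):
    """Keep the lines logged from before_min minutes before fence_time
    to after_min minutes after it. Lines without a timestamp are treated
    as continuations of the previous timestamped line."""
    start = fence_time - timedelta(minutes=before_min)
    end = fence_time + timedelta(minutes=after_min)
    keep = []
    in_window = False
    for line in lines:
        ts = parse_stamp(line, fence_time.year)
        if ts is not None:
            in_window = start <= ts <= end
        if in_window:
            keep.append(line.rstrip("\n"))
    return keep
```

To use it, open the log file on each node (for example
/var/log/messages, if that is where your syslog writes), pass the file
object and a datetime for the moment of the fence to log_window(), and
print the lines it returns. Running it on both the fencing node and the
fenced node makes the two views easier to compare.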

If this is true, do your switches support persistent multicast? Do you
use active/passive bonding? Have you tried a different switch, cable, or
NIC?

--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster