Hello all,
Sorry if this question has been answered before, but I didn't find anything
in the archives.
We deployed Red Hat Cluster Suite at a customer site, and everything
apparently works fine until one node needs to fence the other (for
instance, when we power one node off to test failover).
As usual for us, we configured the fencing using IPMI, which is available on
every modern branded server.
Sometimes one machine cannot fence the other. We can see the cluster start
"ipmitool -I lanplus -H xxx -U xxx -P xxx chassis power off", but the command
times out while trying to power off the other machine.
Stranger still, if at that exact moment we run "ipmitool ... chassis power
status" from the command line against the same node, it works fine.
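For anyone wanting to reproduce the comparison outside the cluster, here is a
minimal sketch that runs the same ipmitool call under a timeout and reports
how long it took. It only assumes the ipmitool options quoted above; the
function names and the 20-second timeout are hypothetical choices for
illustration, not what the fence agent itself does.

```python
import subprocess
import sys
import time


def build_ipmi_command(host, user, password, action):
    """Build the ipmitool call quoted in the post for one power action."""
    if action not in ("status", "on", "off"):
        raise ValueError("unsupported chassis power action: %s" % action)
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user,
            "-P", password, "chassis", "power", action]


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (outcome, elapsed_seconds, output).

    outcome is "ok", "failed" (non-zero exit), "timeout", or "missing"
    (the binary is not installed).
    """
    start = time.monotonic()
    output = ""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True,
                              timeout=timeout_s)
        outcome = "ok" if proc.returncode == 0 else "failed"
        output = (proc.stdout + proc.stderr).strip()
    except subprocess.TimeoutExpired:
        outcome = "timeout"
    except FileNotFoundError:
        outcome = "missing"
    return outcome, time.monotonic() - start, output


if __name__ == "__main__" and len(sys.argv) == 4:
    host, user, password = sys.argv[1:4]
    # Only "status" is run here; swap in "off" deliberately, since it
    # really powers the target node down.
    cmd = build_ipmi_command(host, user, password, "status")
    outcome, elapsed, output = run_with_timeout(cmd, 20)  # hypothetical timeout
    print("%s after %.1fs: %s" % (outcome, elapsed, output))
```

Running it with the BMC address, user and password as arguments, first with
"status" and then with "off", should show whether only the power-off call is
slow or whether the BMC is intermittently unreachable for every request.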
So I have a few questions:
* Can a problem like this (the fence agent being unable to fence) cause
instability in the cluster? In our case the cluster misbehaves even after we
reboot the failed node: it does rejoin the cluster, but rgmanager never
starts.
* Has anyone faced this problem with IPMI? We have used IPMI as a fence
agent in dozens of Red Hat Cluster Suite deployments, since version 3, and
we have never had this kind of problem. The servers in question are Dell
PowerEdge 2900s, with a crossover cable between the onboard #1 NICs of the
two servers, so that each machine has a dedicated network path for powering
off the other.
Thank you all for your support.
Regards,
Celso.
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster