Re: fence start-up issue

Subhendu,

I remember previously seeing a recommendation in the Cluster FAQ that was exactly the opposite: do not use the same network for integrated fencing (iLO, DRAC, IPMI) and heartbeat.

Did this change recently, or in Cluster Suite v5? I'm sure that in v4 I had to put them on separate networks.

Thank you.

Celso.

Subhendu Ghosh wrote:
Eric Ritchie wrote:
I sometimes run into an issue where a node in my 2-node cluster is rebooting and hangs on fenced. It seems it can't communicate with the other node, and after the post_join_delay it fences the other node. This happened again today, and when the second node rebooted after the fence, the two nodes were in a split-brain configuration.

I saw in the Cluster FAQ, in the cman section, question 6, that the cluster communication network should be the same network as the fencing device. I think this may be my problem, but I don't understand why. I'm using HP iLO for fencing, and I set up cross-connect cables for the cluster communication between the 2 nodes. Why would having cluster communication and fencing on different networks be an issue?

Thanks for your time


Having distinct heartbeat and fencing networks creates the possibility of a race condition, which you seem to be running into.

Cluster communication may not have stabilized within the post_join_delay window for any number of reasons, including a network outage. When the fence device is reached over the same path as the other cluster member, fencing from the starting node fails in that case too.

When the two are separated, fencing can succeed while cluster communication is failing.

The recommendation is for cluster communication and iLO reachability to go through the same NIC on the host.
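
The reasoning above can be sketched as a toy model. This is only an illustration, not how fenced is implemented: the names Links, links_for and startup_outcome are hypothetical, and the model assumes that on separate networks the fencing path stays up while the heartbeat path is down.

```python
from dataclasses import dataclass


@dataclass
class Links:
    heartbeat_up: bool  # can the nodes exchange cluster traffic?
    fence_up: bool      # can the booting node reach the peer's fence device (e.g. iLO)?


def links_for(shared_path: bool, heartbeat_up: bool) -> Links:
    """Build the link state seen by a booting node (simplified assumption).

    On a shared path, a fault that breaks the heartbeat also breaks fencing.
    On separate paths, we assume the fencing network is still reachable.
    """
    fence_up = heartbeat_up if shared_path else True
    return Links(heartbeat_up=heartbeat_up, fence_up=fence_up)


def startup_outcome(links: Links) -> str:
    """What the booting node does once post_join_delay has expired."""
    if links.heartbeat_up:
        return "joins cluster"
    if links.fence_up:
        # The peer may be perfectly healthy; only the heartbeat path is down.
        return "fences peer"
    return "fencing fails"


if __name__ == "__main__":
    for shared in (False, True):
        layout = "shared path" if shared else "separate paths"
        outcome = startup_outcome(links_for(shared, heartbeat_up=False))
        print(f"{layout}, heartbeat down: {outcome}")
```

With separate paths and the heartbeat down, the model has the booting node fence its peer, which matches the behaviour Eric describes; with a shared path, the fence attempt fails instead.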

-regards
Subhendu


--
*Celso Kopp Webber*

celso@xxxxxxxxxxxxxxxx <mailto:celso@xxxxxxxxxxxxxxxx>

*Webbertek - Opensource Knowledge*
(41) 8813-1919 - mobile
(41) 4063-8448, ext. 102 - landline


--
This message has been scanned by the antivirus system and
is believed to be free of danger.

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
