On Fri, Feb 02, 2007 at 11:31:21AM +0100, Miroslav Zubcic wrote:
> David Teigland wrote:
>
> > I think you want something like this instead:
> >
> > <fence>
> >   <method name="1">
> >     <device name="pwr01" option="off" port="1" ../>
> >     <device name="pwr02" option="off" port="1" ../>
> >     <device name="pwr01" option="on" port="1" ../>
> >     <device name="pwr02" option="on" port="1" ../>
> >   </method>
> > </fence>
>
> Yes! It works. I just tried two devices in one method, and then
> triggered fencing with "ip link set dev bond0 down" on one node ...
> three times, just in case. It works.
>
> > There are two problems with your config:
> >
> > 1. You have both devices in separate methods. A second method is only
> > tried if the first fails.
>
> I didn't know there is an option to define both devices in one method.
> The system-config-cluster tool that I usually use to create the initial
> configuration/skeleton for a new cluster setup doesn't have such an
> option, and the cluster.conf(5) man page and the PDF documentation fail
> to describe all possible config parameters, so I concluded ad hoc that
> declaring two devices in one method would be an error. Well, OK, now I
> see that it isn't.
>
> > 2. You're using the default "reboot" option which isn't reliable with dual
> > power supplies. The first port may come back on before the second is
> > turned off. So, you need to turn both ports off (ensuring the power is
> > really off) before turning either back on.
>
> I have configured the outlets on the APC switches like this:
>
>      1- Outlet Name         : Outlet 1
>      2- Power On Delay(sec) : 4
>      3- Power Off Delay(sec): 1
>      4- Reboot Duration(sec): 6
>      5- Accept Changes      :
>
> So I didn't use the undocumented off/on options in cluster.conf(5); 6
> seconds is enough for two telnet actions from fence_apc(8), I think.
>
> It would be really nice if the man(1) pages were up2date, eh?
>
> > You may still have a minor problem, though, because in the two-node
> > cluster mode, a cluster partition will result in both nodes trying to
> > fence each other in parallel.
>
> I have discussed this issue on this list earlier. In RHEL 3 there was a
> "tiebreaker IP address" option which disappeared in the RHEL 4 cluster,
> so we have the well-known "split brain" cluster condition.
>
> It would be really nice if fenced(8) checked the ethernet link condition
> before deciding to fence its partner in a two-node cluster. Somehow,
> two-node clusters are very popular in my country.

Have you looked into qdisk yet?  It's new and might help in this area.
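For the archive, here is a rough, untested cluster.conf sketch that ties the
pieces above together: both power switches defined as fence devices, a single
method per node with the explicit off/off/on/on ordering, and an optional
quorumd section for qdisk. The node names, IP addresses, outlet ports,
credentials, qdisk label and heuristic are made-up placeholders; check
cluster.conf(5), fence_apc(8) and the qdisk documentation before using
anything like it:

    <?xml version="1.0"?>
    <cluster name="example" config_version="1">

        <!-- two-node mode; when qdisk is used, two_node is usually dropped
             and expected_votes raised accordingly (see the qdisk docs) -->
        <cman two_node="1" expected_votes="1"/>

        <fencedevices>
            <fencedevice name="pwr01" agent="fence_apc"
                         ipaddr="10.0.0.1" login="apc" passwd="secret"/>
            <fencedevice name="pwr02" agent="fence_apc"
                         ipaddr="10.0.0.2" login="apc" passwd="secret"/>
        </fencedevices>

        <clusternodes>
            <clusternode name="node1" votes="1">
                <fence>
                    <!-- both outlets off first, then both back on -->
                    <method name="1">
                        <device name="pwr01" option="off" port="1"/>
                        <device name="pwr02" option="off" port="1"/>
                        <device name="pwr01" option="on"  port="1"/>
                        <device name="pwr02" option="on"  port="1"/>
                    </method>
                </fence>
            </clusternode>
            <clusternode name="node2" votes="1">
                <fence>
                    <method name="1">
                        <device name="pwr01" option="off" port="2"/>
                        <device name="pwr02" option="off" port="2"/>
                        <device name="pwr01" option="on"  port="2"/>
                        <device name="pwr02" option="on"  port="2"/>
                    </method>
                </fence>
            </clusternode>
        </clusternodes>

        <!-- optional qdisk: the label must match one created with mkqdisk,
             and a heuristic (here, pinging the gateway) can act as a
             tiebreaker so the node that lost the network loses the race -->
        <quorumd interval="1" tko="10" votes="1" label="qdisk">
            <heuristic program="ping -c1 -w1 10.0.0.254" score="1" interval="2"/>
        </quorumd>
    </cluster>

The point of the device ordering is simply that both outlets are known to be
off before either is switched back on, which is what the dual power supply
case needs.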
> > With a single power supply this works fine
> > because one node will always be turned off before it can turn off the
> > other. But, with dual power supplies you can get both nodes turning off
> > one power port on the other, although only one of the nodes should succeed
> > in turning off the second power port. i.e. the winner of the fencing race
> > may end up with one of its power ports turned off. Whether this is a big
> > problem, I don't know.
>
> Yes, this is the fourth time - the fourth cluster installation - and I
> always have this problem, whether the machines have single or dual power
> supplies.
>
> I have a workaround for this:
>
> I create a bonding interface with all physical ethernet ports in it.
> Then I configure a vlanX interface with bond0 as its base. On the Cisco
> ethernet switch, I configure the main ethernet segment as untagged and
> the vlanX ethernet segment as tagged. The main ethernet carries the data
> network, VIP addresses etc., but the connections to the fence devices
> (APC, WTI, iLO, RSA II ...) are on the encapsulated vlan interface.
> This way, as long as the last physical ethernet is functional and
> working, the node is not fenced. If the last ethernet in the bonding
> aggregation fails, the node is fenced, but it doesn't have a chance to
> fence the other node, because the L2 layer + network is on the same
> physical devices as the main link.
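To make that workaround concrete, here is a minimal, untested sketch of the
layout using RHEL-style network-scripts; the VLAN ID (100), addresses and
bonding mode are made-up placeholders, and the matching Cisco switch port
would carry the data VLAN untagged (native) and VLAN 100 tagged:

    # /etc/modprobe.conf -- load the bonding driver for bond0
    alias bond0 bonding
    options bond0 mode=active-backup miimon=100

    # /etc/sysconfig/network-scripts/ifcfg-eth0 (same for eth1, eth2, ...)
    DEVICE=eth0
    ONBOOT=yes
    BOOTPROTO=none
    MASTER=bond0
    SLAVE=yes

    # ifcfg-bond0 -- untagged (native) segment: data network, VIPs, cluster traffic
    DEVICE=bond0
    ONBOOT=yes
    BOOTPROTO=none
    IPADDR=192.168.1.11
    NETMASK=255.255.255.0

    # ifcfg-bond0.100 -- tagged VLAN carrying only the fence devices
    # (APC, WTI, iLO, RSA II, ...)
    DEVICE=bond0.100
    VLAN=yes
    ONBOOT=yes
    BOOTPROTO=none
    IPADDR=10.0.0.11
    NETMASK=255.255.255.0

Since the fence path shares the same physical links as the data path, a node
that has lost its last bonded link can no longer reach the power switches, so
it cannot fence its healthy peer.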