On 06/21/2010 11:28 AM, Kaloyan Kovachev wrote:
On Mon, 21 Jun 2010 10:20:34 +0100, Gordan Bobic <gordan@xxxxxxxxxx> wrote:
On 06/21/2010 08:52 AM, Kaloyan Kovachev wrote:
On Fri, 18 Jun 2010 18:15:09 +0200, brem belguebli
<brem.belguebli@xxxxxxxxx> wrote:
How do you deal with fencing when the intersite interconnects (SAN and
LAN) are the cause of the failure?
GPRS or the good old modem over a phone line?
That isn't going to work if the whole site is down for whatever reason
(unlikely as it may be).
If the whole site is down because of a power failure, then yes (although in
that case you don't actually need to fence anything). But if the failure is
only in the intersite connection, an alternative low-speed connection used
simply to fence the remote nodes and tell the remote SAN to block their
access should be enough.
The problem is that although you don't need to fence anything, you need to:
1) Verify that the site is properly down
2) Make sure it stays down
Otherwise you are risking resource clashes.
To protect yourself from the 100% outage of a remote site, the only sane
way of approaching it that I can think of is to do something like the
following:
1) Make each node fence itself off from the failed node using iptables
or some other firewalling method. The SAN should also be prevented from
allowing the booted out node back onto it.
Then each node would have to do that kind of fencing. Having a single node
block the port(s) on the switch (to the remote location) should be easier to
implement as a fencing agent. Again, having an additional communication
channel will help: "if it's just the link, then fence the remote nodes and
don't block the port(s)". This would avoid manual intervention to restore
the link after the outage is fixed.
There is no reason why you couldn't fire off the iptables fencing
command to each node via SSH, so that whichever node does the fencing
covers it for all nodes.
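
As a rough illustration of the SSH fan-out idea, something along these lines could work. This is only a sketch, not a tested fencing agent: the function names, the root login, the node names and the choice of plain DROP rules on INPUT and OUTPUT are all assumptions made for the example, and a real agent would also need to handle the SAN side and report failures back to the cluster.

```python
import shlex
import subprocess


def iptables_fence_rules(failed_ip):
    """Build iptables commands that drop all traffic to and from failed_ip.

    The rule layout (INPUT/OUTPUT, DROP) is an assumption for illustration.
    """
    return [
        ["iptables", "-I", "INPUT", "-s", failed_ip, "-j", "DROP"],
        ["iptables", "-I", "OUTPUT", "-d", failed_ip, "-j", "DROP"],
    ]


def ssh_fence_commands(surviving_nodes, failed_ip, user="root"):
    """Return one ssh command line per surviving node.

    Each command runs every fencing rule on the remote node, so whichever
    node initiates the fencing covers it for all of them.
    """
    remote = " && ".join(
        shlex.join(rule) for rule in iptables_fence_rules(failed_ip)
    )
    return [
        ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5",
         f"{user}@{node}", remote]
        for node in surviving_nodes
    ]


def fence_everywhere(surviving_nodes, failed_ip, dry_run=True):
    """Apply the fencing rules on every surviving node.

    With dry_run=True the commands are only printed, never executed.
    Returns a dict mapping node name to True on success.
    """
    results = {}
    commands = ssh_fence_commands(surviving_nodes, failed_ip)
    for node, cmd in zip(surviving_nodes, commands):
        if dry_run:
            print(shlex.join(cmd))
            results[node] = True
        else:
            results[node] = subprocess.run(cmd).returncode == 0
    return results


if __name__ == "__main__":
    # Hypothetical node names and address, printed rather than executed.
    fence_everywhere(["node1", "node2"], "10.0.0.5", dry_run=True)
```

Any node for which the result is not True has not confirmed the rules, so the fencing should not be treated as complete until that node has been dealt with some other way.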
Gordan