On 06/21/2010 08:52 AM, Kaloyan Kovachev wrote:
> On Fri, 18 Jun 2010 18:15:09 +0200, brem belguebli
> <brem.belguebli@xxxxxxxxx> wrote:
>> How do you deal with fencing when the intersite interconnects (SAN and
>> LAN) are the cause of the failure?
> GPRS or the good old modem over a phone line?
That isn't going to work if the whole site is down for whatever reason
(unlikely as it may be).
To protect yourself from the 100% outage of a remote site, the only sane
approach I can think of is to do something like the following:
1) Make each node fence itself off from the failed node using iptables
or some other firewalling method. The SAN should also be prevented from
allowing the booted out node back onto it.
2) Fail over the IP address or DNS name of the service. Since it's
across different sites, you are likely to have to use something like RIP
to re-route the IPs, so DNS on short refresh may well be an easier and
possibly safer option. It'll mean some downtime, but probably less than
any manual intervention in an unplanned case.
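
As a rough illustration of the DNS option in step 2, here is a minimal
Python sketch that builds a script for the BIND nsupdate tool to repoint
the service name at the surviving site with a short TTL. The server,
zone, host name, address and the 60 second TTL are all made-up
placeholders, and it assumes your DNS server accepts dynamic updates.

```python
import subprocess


def build_nsupdate_script(server, zone, name, new_ip, ttl=60):
    """Return nsupdate input that replaces the A record for `name`.

    All arguments are hypothetical placeholders; the short TTL is what
    keeps client-side caching (and so the downtime) small.
    """
    lines = [
        f"server {server}",
        f"zone {zone}",
        f"update delete {name} A",               # drop the old site's address
        f"update add {name} {ttl} A {new_ip}",   # point at the surviving site
        "send",
    ]
    return "\n".join(lines) + "\n"


def apply_update(script, key_file=None):
    """Feed the script to nsupdate on stdin (needs nsupdate installed)."""
    cmd = ["nsupdate"]
    if key_file:
        cmd += ["-k", key_file]  # TSIG key file, if the server requires one
    subprocess.run(cmd, input=script, text=True, check=True)


# Example with placeholder values (does not contact any server):
script = build_nsupdate_script(
    "ns1.example.com", "example.com", "service.example.com.", "192.0.2.10"
)
print(script)
```

Only build_nsupdate_script() is exercised above; apply_update() is the
part that would actually touch DNS, so call it only from the failover
path once the other site has been fenced.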
It's not entirely ideal, but it's about as good as it is likely to get.
And you can write a fencing agent to do something like this easily enough.
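
To give an idea of what such an agent might look like, here is a minimal
Python sketch of the iptables part of step 1. It assumes the agent is
handed key=value options on stdin, and the option names "ipaddr" and
"action" are my assumptions for illustration, so adjust them to whatever
your cluster configuration actually passes. It does nothing about the
SAN side, which would need its own mechanism.

```python
import subprocess
import sys


def parse_options(lines):
    """Parse key=value lines, skipping blanks, comments and junk."""
    opts = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        opts[key.strip()] = value.strip()
    return opts


def build_rules(peer_ip, action="off"):
    """Return iptables commands that cut off (or restore) a peer.

    "off" inserts DROP rules for traffic from and to the peer;
    "on" deletes the same rules again.
    """
    flag = "-I" if action == "off" else "-D"
    return [
        ["iptables", flag, "INPUT", "-s", peer_ip, "-j", "DROP"],
        ["iptables", flag, "OUTPUT", "-d", peer_ip, "-j", "DROP"],
    ]


def main(lines=None, run=subprocess.check_call):
    """Read options, apply the rules, return 0 on success and 1 on error."""
    opts = parse_options(sys.stdin if lines is None else lines)
    peer = opts.get("ipaddr")            # assumed option name
    action = opts.get("action", "off")   # assumed option name
    if not peer or action not in ("off", "on"):
        sys.stderr.write("need ipaddr=<peer> and action=off|on\n")
        return 1
    for cmd in build_rules(peer, action):
        run(cmd)                         # needs root when really run
    return 0
```

A real agent would finish with sys.exit(main()) so the cluster sees the
exit status. The run parameter is only there so the logic can be tested
without touching the firewall.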
Gordan
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster