Hi Matteo,

First off, you are correct: strictly from a "SPF protection / all other failure scenarios are irrelevant" point of view, losing power -> fencing failure is bad. However, I hope I can convince you that this particular view is not the right one to take in this case, though I doubt I will be able to.

On Wed, 2006-03-22 at 17:17 +0100, Matteo Catanese wrote:
> We are always talking about avoiding _single point of failure_, not
> multiple ones.

We recover from several multi-point failures if there is a deterministic way to do so - for example, sustaining the failure of 5 nodes in a 16-node cluster. More so than NSPF, the cluster is designed to minimize uncertainty in any failure case where possible - especially where data integrity is concerned (i.e. fencing).

Given the above design goal, one can still very easily build NSPF two-node clusters, but there are limitations on the hardware you can use. For example:

* With iLO, you need redundant power supplies.

* With IPMI, you need redundant power supplies and an extra NIC.

* With single power supplies, you should use a remote power switch with redundant power rails (where the internal electronics can run off of either) for full NSPF protection. As of this writing, I am unaware of any such thing being available from any of the major IHVs.

* If redundant power supplies are not "redundant enough" in your opinion, then you should probably use a redundant remote power switch as noted above.

> So please at least for fence_ilo allow some parameter to let fence
> spit out a warning and unlock the cluster service

Fencing, put simply, is a deterministic set of steps taken to guarantee that a dead or misbehaving node can not (not "might not" or "probably will not") access shared resources/partitions/storage. It is designed to have exactly two possible outcomes given a correctly configured environment:

- The node has been cut off from shared resources, or
- Fencing the node has failed.

If fencing fails, we retry forever; fencing failures are otherwise unrecoverable. The only way to recover from a particular fencing failure is to provide a different fencing mechanism as a backup (a "cascade").

Ok, on to how one could change the behavior...

From a design perspective, if we were to change the behavior of fencing, I would recommend changing it in fenced, not fence_ilo (e.g. give fenced a max_retries count or something), because once we do it for iLO, we will have to do it for many other agents. For example, most or all of the supported APC switches have only a single (non-redundant) power rail, so fence_apc would have to be changed too.

Here are some things you can do for your configuration:

(a) Add a human layer. Add a manual fencing agent as a cascade to catch this particular problem. This is, in my opinion, the least likely to solve your problem in the way you want, but it may be acceptable if you consider a total power failure of a node fairly unlikely.

(b) Make fencing not fail. Edit /sbin/fence_ilo and make it do what you need (a rough sketch of one way to do this is at the end of this message).

(c) Roll your own fencing agent and add it as a cascade which does specifically what you want when iLO fencing fails - for example, /sbin/fence_dontcare (don't forget to chmod +x it):

    #!/bin/bash
    # Last-resort agent: log loudly, notify a human, and always report
    # success to fenced. (fenced passes agent options on stdin; this
    # agent deliberately ignores them.)
    logger -p daemon.emerg "WARNING - iLO fencing failed; data integrity may be compromised, but continuing anyway."
    echo "Ruh roh!" | mail -s "fencing failed" my@xxxxxxxxxx
    exit 0

Don't forget to add fence references to your cluster.conf (a sketch of this is also at the end of this message).

(d) Buy a redundant external power switch and use it as a cascade (or as the primary fencing method) in case iLO is unreachable. Here is a WTI NPS on eBay for $125:

http://cgi.ebay.com/WTI-NPS-115-Remote-Telnet-Power-Reboot-NIB-Switch_W0QQitemZ9701395350QQcategoryZ11175QQssPageNameZWDVWQQrdZ1QQcmdZViewItem

The NPS has two power rails, and the internal electronics can run off of either. I.e., you can actually build an NSPF configuration with nodes that lack redundant power supplies - without having to weaken any guarantees about data integrity. (Note: the NPS 115 is past its end of life; WTI has a replacement, but it will cost more than $125.)
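If you go the route of (b), here is a minimal sketch of one way to do it without editing the agent's internals: move the stock agent aside and drop in a wrapper. Note that /sbin/fence_ilo.real is my invention for this example - nothing in the package creates it; you would have to move the real agent there yourself:

    #!/bin/bash
    # Hypothetical replacement for /sbin/fence_ilo: run the real agent
    # (moved aside by hand to fence_ilo.real), but never report failure.
    # fenced passes agent options on stdin; the child inherits it.
    /sbin/fence_ilo.real "$@"
    rc=$?
    if [ $rc -ne 0 ]; then
        logger -p daemon.emerg "fence_ilo failed (rc=$rc); reporting success to fenced anyway"
    fi
    exit 0

Needless to say, this throws away the data-integrity guarantee described above, which is exactly why it is not a switch we ship.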
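And here is roughly what the cascade wiring for (c) looks like in cluster.conf: fenced tries a node's methods in order, falling through to method "2" only if method "1" fails. The node names, device names, and fence_ilo attributes below are illustrative only, not copied from a working config - check the fence_ilo man page for the exact attribute names your version expects:

    <cluster name="example" config_version="2">
      <clusternodes>
        <clusternode name="node1" votes="1">
          <fence>
            <method name="1">
              <device name="node1-ilo"/>
            </method>
            <method name="2">
              <device name="dontcare"/>
            </method>
          </fence>
        </clusternode>
        <!-- node2 is configured the same way -->
      </clusternodes>
      <fencedevices>
        <fencedevice name="node1-ilo" agent="fence_ilo"
                     hostname="node1-ilo.example.com" login="Administrator"
                     passwd="secret"/>
        <fencedevice name="dontcare" agent="fence_dontcare"/>
      </fencedevices>
    </cluster>

The same layout works for (d); just point method "2" at your power switch's agent (e.g. fence_wti) instead of fence_dontcare.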
-- Lon