On Fri, 14 Dec 2007, Roger Peña wrote:
--- gordan@xxxxxxxxxx wrote:
On Fri, 14 Dec 2007, Roger Peña wrote:
I think this is question #1 in the FAQs and on this list :-)
The short answer, and the first place to look, is:
1- fencing not configured, or configured as manual
2- fencing problems, i.e. the devices not working as they should
The problem is that I don't have any devices I could
do fencing with. Is
Do you not have:
1- shared storage? Usually the "server" of the shared storage has a way
to cut off a client's access to the storage, so this can serve as a
fencing device.
2- What kind of servers do you have? HP servers have iLO, and Sun and
Dell servers have something similar, so those interfaces can act as
fencing devices.
I have Dell servers, but nothing that can be used to monitor them.
I'm really only looking for something simple - if a node fails 10 pings in
a row or fails to respond to a ping in 10 seconds, kick it off. If it
rejoins (on boot-up), then it should be allowed to join.
If all nodes monitor all other nodes, and kick the ones they can't
contact, they'll either fence the dead node, or the dead node will fence
itself off if there's a NIC failure. Or if the switch fails they'll all
fence themselves off, but, in that case, so what...
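
The rule described above could be sketched roughly as follows. This is
only an illustration, not an existing cluster tool: the PeerTracker
class, its method names and the ("fence_self" / "fence_peers") return
values are all hypothetical, and the thresholds of 10 failures and 10
seconds are taken from the description. It also assumes that a node
which cannot reach any peer should treat itself as the failed one.

```python
import time


class PeerTracker:
    """Hypothetical per-node bookkeeping for ping-based failure detection."""

    def __init__(self, peers, max_failures=10, timeout_s=10.0,
                 now=time.monotonic):
        self.max_failures = max_failures
        self.timeout_s = timeout_s
        self.now = now  # injectable clock, so the logic can be tested
        start = now()
        self.failures = {p: 0 for p in peers}
        self.last_seen = {p: start for p in peers}

    def record(self, peer, responded):
        """Record the outcome of one ping to a peer."""
        if responded:
            # A reply resets the counter, so a rebooted node can rejoin.
            self.failures[peer] = 0
            self.last_seen[peer] = self.now()
        else:
            self.failures[peer] += 1

    def is_dead(self, peer):
        """Dead = too many consecutive failures OR silent for too long."""
        silent_for = self.now() - self.last_seen[peer]
        return (self.failures[peer] >= self.max_failures
                or silent_for >= self.timeout_s)

    def decide(self):
        """Return ("fence_self", []) or ("fence_peers", [dead peers])."""
        dead = [p for p in self.failures if self.is_dead(p)]
        if dead and len(dead) == len(self.failures):
            # Nobody is reachable: assume our own NIC or the switch failed.
            return ("fence_self", [])
        return ("fence_peers", dead)
```

Note that with only two nodes every lost peer looks like "nobody is
reachable", so this simple rule cannot tell a dead peer from a local NIC
failure in that case.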
there a way to achieve this without external
monitoring?
Not that I know of, but I would not want to anyway :-) I would like to
be sure that a node with problems gets kicked from the cluster so that
it does not mess things up. That is why I would decline to start a
cluster without at least a first level of fencing.
Except I don't have any fail-over services per se. All nodes run all
services. If a node fails, it won't respond and the load-balancer will
just stop directing TCP traffic to it.
At the moment, I'm thinking about the fencing console in the OSR tools,
and writing a small monitoring daemon in Perl to use it to kick out the
nodes that aren't responding. It's just that it'd be nice if there was
already something out there that'll do this...
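
For illustration, such a monitoring daemon might look something like the
sketch below (in Python rather than Perl). Everything here is an
assumption: run_monitor and ping_once are made-up names, the fence
argument is a placeholder for whatever actually kicks a node out (the
real fencing console interface is not shown here), and ping_once assumes
the Linux iputils ping, where -c is the packet count and -W the reply
timeout in seconds.

```python
import subprocess
import time


def ping_once(host, timeout_s=1):
    """Return True if a single ICMP echo to host is answered."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def run_monitor(peers, fence, probe=ping_once, max_failures=10,
                interval_s=1.0, rounds=None, sleep=time.sleep):
    """Probe every peer each round; call fence(peer) after too many misses.

    fence is a caller-supplied callable (placeholder for the real
    fencing action). rounds=None means run forever.
    """
    failures = {p: 0 for p in peers}
    fenced = set()
    done = 0
    while rounds is None or done < rounds:
        for peer in peers:
            if probe(peer):
                failures[peer] = 0
                fenced.discard(peer)  # node is back, let it rejoin
            else:
                failures[peer] += 1
                if failures[peer] >= max_failures and peer not in fenced:
                    fence(peer)       # fence only once per outage
                    fenced.add(peer)
        done += 1
        sleep(interval_s)
    return fenced
```

Passing the probe and fence actions in as arguments keeps the loop
independent of any particular fencing mechanism, and makes it possible
to dry-run it with fake functions before pointing it at real nodes.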
Gordan
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster