Hi Ian,
I think there is a flaw in the design. For example, say the network card fails
on machine A. Machine B detects this and tries to fence machine A. The problem
with fencing over ssh by modifying iptables is that there is no network
connectivity to machine A, so this mechanism can never work in that case. What
you need is a solution that works independently of the OS, such as a power
switch or a remote management interface (IBM RSA II, HP iLO, etc.). Fencing has
to be absolute and ruthless: in this example, machine B needs to be able to
fence machine A every time there is a problem, and as soon as there is a
problem.
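To make the failure mode concrete, here is a rough Python sketch. It is a toy model, not real fencing code: the Node class and both fence functions are made up for illustration, and it assumes the power switch itself stays reachable (for example on a separate management network). It models machine A with a dead network card and compares an in-band fence over ssh with an out-of-band fence through a power switch.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Toy model of a cluster node (hypothetical, for illustration only)."""
    name: str
    network_up: bool = True
    powered: bool = True
    fenced: bool = False


def fence_via_ssh(node: Node) -> bool:
    """In-band fencing: needs a working network path and a running OS."""
    if not (node.network_up and node.powered):
        # Corresponds to "ssh: connect to host ... No route to host".
        return False
    node.fenced = True
    return True


def fence_via_power_switch(node: Node) -> bool:
    """Out-of-band fencing: cuts power without involving the node's OS or NIC.

    Assumption: the power switch itself is reachable from the surviving node.
    """
    node.powered = False
    node.fenced = True
    return True


machine_a = Node("A", network_up=False)  # the network card has failed
ssh_result = fence_via_ssh(machine_a)
power_result = fence_via_power_switch(machine_a)
print(f"ssh fence succeeded: {ssh_result}")
print(f"power fence succeeded: {power_result}")
```

The ssh path depends on exactly the component that failed, so it cannot succeed; the power path does not depend on machine A at all.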
Regards
John
----- Original Message -----
Sent: Friday, April 10, 2009 1:07 AM
Subject: Fenced failing continuously
I've been testing a newly built 2-node cluster. The cluster
resources are a virtual IP and Squid, so on a node failure the VIP would move
to the surviving node, which would then start Squid. I'm running a modified
fencing agent that SSHes into the failing node and firewalls it off via
iptables (not my choice).
This all works fine for graceful shutdowns, but when I do
something nasty like pulling the power cord on the node that is currently
running the service, the surviving node never takes over the service and spends
all its time trying to fire off the fence agent, which obviously will not work
because the server is completely offline. The only way I can get the surviving
node to take over the VIP and start Squid is to run fence_ack_manual, which
sort of defeats the purpose of running a cluster to begin with. The logs are
filled with:
Apr 12 00:01:44 <hostname> fenced[3223]: fencing node "<otherhost>"
Could not disable xx.xx.xx.xx
on 23]: agent "fence_iptables" reports: ssh: connect to host xx.xx.xx.xx port 22: No route to host
Is this a misconfiguration, or is there an option I can include somewhere to
tell the nodes to give up after a certain number of tries?
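The behaviour described here can be modelled roughly as follows. This is a toy Python sketch, not fenced's actual logic: recover_service, always_fails and the max_attempts cap are hypothetical names, and the cap exists only so the demo terminates, whereas the retries in the situation above go on indefinitely.

```python
from typing import Callable, Tuple


def recover_service(fence_agent: Callable[[], bool],
                    manual_ack: bool = False,
                    max_attempts: int = 5) -> Tuple[bool, int]:
    """Return (service_taken_over, attempts_made).

    The survivor may take over the VIP and Squid only once the failed
    node is confirmed fenced, either by the agent or by a manual ack.
    max_attempts is a demo-only cap, not an option of the real daemon.
    """
    if manual_ack:
        # Operator has confirmed the node is down (the fence_ack_manual case).
        return True, 0
    for attempt in range(1, max_attempts + 1):
        if fence_agent():
            return True, attempt
    return False, max_attempts


def always_fails() -> bool:
    # Stand-in for the ssh-based agent when the peer has lost power.
    return False


blocked = recover_service(always_fails)
acked = recover_service(always_fails, manual_ack=True)
print(blocked)  # takeover never happens while every fence attempt fails
print(acked)    # a manual acknowledgement unblocks recovery
```

In this model, simply giving up after N tries would mean taking over the service without knowing whether the other node is really out of the picture.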
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster