Josef Whiter wrote:
You can either have redundant fence devices, or look into qdisk.
Thanks for the reply. Can you explain how qdisk would solve the
problem? It seems to me that qdisk wouldn't help when the fencing
device fails and that same failure takes down the cluster member it
is attached to.
Does qdisk have some feedback mechanism that tells the cluster that it's
ok to restart the failed services on another node without fencing being
successful? I can't see how that can work reliably and still prevent
split brain problems.
On Tue, Jan 09, 2007 at 10:50:53AM -0800, Jonathan Biggar wrote:
If we set up a cluster and use network power switches for fencing, won't
a failure of the power switch attached to a cluster member prevent every
service that was running on that node from migrating to other cluster
members?
This seems to happen to us in practice, because fencing the offline
member fails due to the power switch being unavailable, so rgmanager
never migrates the failed service(s) to another member.
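
To make the failure mode concrete, here is a toy model of the behavior described above. It is only a sketch, not the actual fenced or rgmanager code: the function names (fence_node, recover_services), the fence-method callables, and the service names are all hypothetical. It assumes that fence methods for a node are tried in order and that services may be relocated only after one of them reports success.

```python
from typing import Callable, List

# Hypothetical type: a fence method takes a node name and returns True
# only if it could confirm that the node was powered off or cut off.
FenceMethod = Callable[[str], bool]


def fence_node(node: str, methods: List[FenceMethod]) -> bool:
    """Try each configured fence method in order; stop at the first success."""
    for method in methods:
        if method(node):
            return True
    return False


def recover_services(node: str, services: List[str],
                     methods: List[FenceMethod]) -> List[str]:
    """Return the services that may be relocated to another member.

    Nothing is relocated unless fencing succeeded, because an unfenced
    node might still be running the services (the split-brain risk).
    """
    if not fence_node(node, methods):
        return []
    return list(services)


def dead_power_switch(node: str) -> bool:
    # Models the unreachable power switch: fencing cannot be confirmed.
    return False


def backup_fence(node: str) -> bool:
    # Models a second, independent fence device that still works.
    return True


# Single fence device, and it is down: no service is relocated.
print(recover_services("node1", ["nfs", "httpd"], [dead_power_switch]))

# Redundant fence devices: the backup succeeds, so relocation can proceed.
print(recover_services("node1", ["nfs", "httpd"],
                       [dead_power_switch, backup_fence]))
```

With only the dead switch configured, the model relocates nothing, which matches what we see. With a second working method in the list, recovery goes ahead.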
Is there a general solution to this problem that I'm missing?
--
Jon Biggar
Levanta
jon@xxxxxxxxxxx
650-403-7252
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster