On Tue, 2007-11-20 at 14:06 -0800, Scott Becker wrote:
> I've been pondering what I'm actually looking for.
>
> Each of my nodes has a public and a private NIC. Public is for serving
> web pages, private is for fencing. I was desperately trying to get
> fencing to work over the public network, but I was faced with
> reimplementing a complicated fence agent in C in order to use ssh
> (supported ok by my power switches but difficult to add to the python
> fence agent).
>
> My remaining issue is that if I lose one of my public NICs, I must
> ensure that the ensuing fencing race is won by the good node and not the
> bad node which thinks it's good. Quorum doesn't solve this, because it
> must also work down to the 'last man standing' (starting with 3 nodes).
>
> So pondering, I realized that I don't really need to monitor the ability
> to reach the gateway. What I need is for a public comm error to create
> an event, hence I use the public NIC for cluster comms. Then do
> something so that the bad node doesn't fence the good nodes.
>
> So assuming only one real failure at a time, I'm thinking of making the
> first step in the fencing method a check for pinging the gateway. That
> way when a node wants to fence, it will only be able to if its public
> NIC is working, even though it's using the private NIC for the rest of
> the fencing.

That's a pretty good + simple idea.

-- Lon
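
For illustration, the gateway check proposed above could look roughly like the sketch below. It is not an existing fence agent, just a minimal outline under some assumptions: the agent is handed its options as key=value lines on stdin (the usual convention for fence agents), the option name "gateway" and the function names are made up for this example, and ping is the Linux iputils one (-c for count, -W for reply timeout in seconds).

```python
import subprocess


def parse_options(lines):
    """Parse key=value option lines, skipping blanks and '#' comments."""
    opts = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            opts[key.strip()] = value.strip()
    return opts


def gateway_reachable(gateway, count=3, timeout_s=2, runner=subprocess.call):
    """Return True if ping to the gateway exits with status 0.

    'runner' is injectable so the check can be exercised without a network.
    """
    cmd = ["ping", "-c", str(count), "-W", str(timeout_s), gateway]
    status = runner(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return status == 0


def precheck_exit_code(lines, runner=subprocess.call):
    """Exit code for the hypothetical precheck agent.

    0 means this node's public side looks healthy and fencing may proceed;
    1 means the gateway is unreachable (or not configured), so this node
    should not be allowed to fence anyone.
    """
    gateway = parse_options(lines).get("gateway")  # hypothetical option name
    if not gateway:
        return 1
    return 0 if gateway_reachable(gateway, runner=runner) else 1
```

A real script would finish with something like sys.exit(precheck_exit_code(sys.stdin)) and be listed as the first device in the fencing method, ahead of the power switch, so that a failed ping stops the method before the power switch is ever touched. That relies on every device in a method having to succeed in order, which is worth verifying against your cluster version.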