On Tuesday, 01.09.2009, at 12:48 +0200, Jakov Sosic wrote:
> On Tue, 01 Sep 2009 12:29:36 +0200
> "Marc - A. Dahlhaus [ Administration | Westermann GmbH ]" <mad@xxxxxx>
> wrote:
>
> > It isn't misbehaving at all here.
> >
> > The job of RHCS in this case is to protect your data against failure.
> >
> > If fenced can't fence a node successfully, RHCS will wait in stalled
> > mode (because it doesn't get a successful response from the
> > fence agent) until someone who knows what he is doing comes around to
> > fix the problem. If it didn't work that way, a separated node
> > could eat up your data. It is the job of fenced to stop all
> > activities until fencing is in working shape again.
> >
> > This behaviour is perfectly fine IMO...
>
> Isn't that the mission of quorum? For example - if you have quorum you
> will run services, if you don't have quorum you won't. If there is a
> qdisk and a single one of three nodes is missing, it can't have quorum -
> so it can't run services?
>
> OK, I understand that this is the safer way... But that's why I was
> asking in the first place for a command to flag a node as missing
> completely, so that I can avoid all reconfigurations. Reconfiguration
> while a node is missing will trigger odd behavior when the node comes
> back - it will be fenced constantly because it has the wrong config
> version.
>
> > - You use system dependent fencing like "HP iLO", which will be missing
> >   if your system is missing, and no independent fencing like an
> >   APC PowerSwitch...
>
> Yes, but those are the only devices I have available for fencing. So
> that is a limitation of the hardware, over which I don't have any
> influence in this case. I already know that fence devices are my only
> SPOF currently... But I can't help it.
>
> > Think about a power surge which kills both of the PSUs on a system:
> > a system dependent management device would be missing from your
> > network in this case, leading to exactly the problem you're faced
> > with.
>
> I will take a look at whether APC UPSes have something like killpower
> for certain ports; if not, I will set up false manual fencing to get
> around this problem. Thank you.

It's actually the "APC Switched Rack PDUs" that you should look at.
You can get an 8 port device for a small budget...

> > Your mistake is that you started fenced in normal mode, in which it
> > will fence all nodes that it can't reach to get around a possible
> > split-brain scenario. You need to start fenced in "clean start"
> > mode, without fencing (read the fenced manpage, as it is documented
> > there), because you know everything is right.
>
> Adding clean_start again presumes reconfiguring, just like removing a
> node and declaring the cluster a two_node, and I wanted to avoid
> reconfigurations...

It's just a matter of starting fenced with "fenced -c" on your two
nodes. No cluster.conf fiddling needed at all...

Search for "start_daemon fenced" in /etc/init.d/cman and add " -c"
after it. You should remove that again once your third node is back.

> Thank you very much.

You're welcome.

Marc

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
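
The init script edit described in the reply (adding " -c" after
"start_daemon fenced", and removing it again later) could be scripted
roughly as below. This is only a sketch: it assumes the init script
really contains the literal text "start_daemon fenced" as the message
says, the two function names are made up for illustration, and it
relies on sed's -i option, which is not part of POSIX.

```shell
#!/bin/sh
# Sketch only: toggle the " -c" (clean start) flag on the fenced start line.
# Assumption: the init script contains the literal text "start_daemon fenced".
# The function names below are hypothetical helpers, not part of RHCS.

# Append " -c" after "start_daemon fenced" unless it is already present.
# sed keeps a backup copy of the original file as "$1.bak".
add_clean_start() {
    sed -i.bak '/start_daemon fenced -c/!s/start_daemon fenced/& -c/' "$1"
}

# Undo the change once the third node is back in the cluster.
remove_clean_start() {
    sed -i.bak 's/start_daemon fenced -c/start_daemon fenced/' "$1"
}
```

Run as root on each of the two remaining nodes, this would be
"add_clean_start /etc/init.d/cman", and later
"remove_clean_start /etc/init.d/cman". It is worth diffing the script
against the .bak copy before starting cman again.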