On 10-11-11 04:23 AM, Gordan Bobic wrote:
> Jankowski, Chris wrote:
>> Digimer,
>>
>> 1.
>> Digimer wrote:
>>>>> Both partitions will try to fence the other, but the slower
>>>>> will lose and get fenced before it can fence.
>>
>> Well, this is certainly not my experience in dealing with modern
>> rack mounted or blade servers where you use iLO (on HP) or DRAC (on
>> Dell).
>>
>> What actually happens in two node clusters is that both servers
>> issue the fence request to the iLO or DRAC. It gets processed
>> and *both* servers get powered off. Ouch!! Your 100% HA cluster
>> becomes 100% dead cluster.
>
> Indeed, I've seen this, too, on a range of hardware. My quick and dirty
> solution was to doctor the fencing agent to add a different sleep() on
> each node, in order of survivor preference. There may be a setting in
> cluster.conf that can be used to achieve the same effect, can't remember
> off the top of my head.
>
> Gordan

I've not seen such an option, though I make no claim to complete
knowledge of the options available. I do know that there are per-device
fence options (that is, IPMI has a set of options that differs from
DRAC, etc.), so perhaps there is an option there.

I am very curious to know how this scenario can happen. As I had
previously understood it, this should simply not be possible. Obviously
it is, though...

The only case I can think of is one where the fence device is external
to the nodes and allows multiple fence calls at the same time. I would
expect any fence device to terminate a node nearly instantly. If it
doesn't or can't, then I would suggest that it not accept a second fence
request until the pending one completes.

-- 
Digimer
E-Mail: digimer@xxxxxxxxxxx
AN!Whitepapers: http://alteeve.com
Node Assassin: http://nodeassassin.org

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
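
As a rough illustration of the race discussed in this thread, the Python sketch below models two nodes that each try to fence the other, with an optional per-node delay in the spirit of the sleep() workaround Gordan describes. Everything here is an assumption made for illustration: the names `Node`, `fence_delay`, `power_off_latency` and `simulate_fence_race` are hypothetical, and the model simply assumes that the fence device accepts every request and takes a fixed time to cut power. It is not how any real fence agent is implemented.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    # Hypothetical knob: seconds this node waits before issuing its
    # fence request against the peer (the "sleep()" in the workaround).
    fence_delay: float


def simulate_fence_race(a: Node, b: Node, power_off_latency: float) -> list[str]:
    """Return the names of the nodes still powered after a two-node fence race.

    Assumed model: each node issues its request at t = fence_delay, and the
    target loses power power_off_latency seconds after the request is issued.
    A node that has already lost power never issues its own request.
    """
    first, second = sorted((a, b), key=lambda n: n.fence_delay)

    # The faster node fires first; this is when the slower node goes dark.
    second_dies_at = first.fence_delay + power_off_latency

    if second.fence_delay < second_dies_at:
        # The slower node is still powered when its delay expires, so it
        # fires as well and both nodes end up powered off.
        return []

    # The slower node was powered off before it could fire.
    return [first.name]


if __name__ == "__main__":
    latency = 4.0  # assumed seconds between request and actual power-off
    print(simulate_fence_race(Node("node1", 0.0), Node("node2", 0.0), latency))
    print(simulate_fence_race(Node("node1", 0.0), Node("node2", 10.0), latency))
```

Under these assumptions, a single survivor is only guaranteed when the difference between the two delays is at least as long as the time the fence device needs to actually cut power, which is why the stagger has to be chosen with the slowest device in mind.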