Re: More CS4 fencing fun

Matteo Catanese <m.catanese@xxxxxxxxxxxxx> · Fri, 24 Mar 2006 11:06:22 +0100

Hi Lon,
your mail is "music" to my ears :D

I will try your /sbin/fence_dontcare immediately.

I will, anyway, try to explain myself better, because English is not
my first language.

I understand that Cluster Suite is also about protection against
multiple failures and about data integrity, but our goal is a 100% NSPOF
cluster. I don't want to be interrupted on weekends when I play my
favourite video game (WoW) just because ONE component broke and the
whole cluster hung :-)

Sure, our hardware configuration can also survive some multi-point
failures, but NSPOF is our main goal.

We have almost everything redundant.

Every server has dual power supplies connected to independent power
sources, dual NICs, mirrored internal HDs with a hot spare, and 2 FC
cards connected to an MSA 1000, which has redundant controllers and
redundant power supplies, also connected to independent power sources.

On the MSA 1000 we have a RAID 5 array with a hot spare.

We have all these things, and it's really frustrating for us that if
the active node's mainboard fails, because of a short circuit, too high a
temperature, the failure of some vital component or whatever, everything hangs.

About WTI:

In my case the WTI would be useful only after multiple failures: for
example, if both network switches fail, so the heartbeat fails, and iLO
fails too, then with /sbin/fence_dontcare I would get corruption. Is this
correct?

I will need an additional NIC in every server to connect to the WTI,
but since the WTI has only one Ethernet port I will need a separate hub
or switch to connect to it. Or can I connect one server to the
Ethernet port and the other one to the serial port? Can I manage it
through both the serial and Ethernet ports?

Matteo

--

Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster