Hi Lon,
your mail is "music" to my ears :D
I will try your /sbin/fence_dontcare immediately.
I will try to explain myself better anyway, because English is not
my main language.
I understand that cluster suite is also about multiple-failure
protection and data integrity, but our goal is a 100% NSPOF
cluster. I don't want to be interrupted on weekends when I play my
favourite video game (WOW) just because ONE component broke and the
whole cluster hung :-)
Sure, our hardware configuration can also sustain some multi-point
failures, but NSPOF is our main goal.
We have almost everything redundant.
Every server has dual power supplies connected to independent power
sources, dual NICs, internal HDs mirrored with a hot spare, and 2 FC
cards to connect to an MSA 1000, which has redundant controllers and
redundant power supplies, also connected to independent power sources.
On the MSA 1000 we have a RAID 5 with a hot spare.
We have all these things, and it's really frustrating for us that if
the active node's mainboard fails, because of a short circuit, too high
a temperature, some vital component failure or whatever, then
everything hangs.
About WTI:
In my case the WTI would be useful only in case of multiple failures.
For example, if both network switches fail, then heartbeat fails and
iLO fails too, and with /sbin/fence_dontcare I will have corruption.
Is this correct?
I will need an additional NIC in every server to connect to the WTI,
but since the WTI has only one Ethernet port I will need a separate
hub or switch to connect to it. Or can I connect one server to the
Ethernet port and another one to the serial port? Can I manage it
through both the serial and the Ethernet port?
Matteo