On Sat, 2008-02-23 at 09:06 -0800, Jonathan Biggar wrote:
> > Hardware watchdog timers are going to be more reliable than just about
> > anything qdiskd could provide.
>
> Ok, I get it. It's probably a couple of orders of magnitude more
> reliable, but since it relies only on timing, there's no real *positive*
> indication that the fencing succeeded, so it's really only best-effort.
> Even though it would take three failures (network disruption of
> heartbeat, quorumd failing to reboot the node and the watchdog timer
> failing as well), there's still a slim, slim chance that the node is
> still trying to write to the SAN. If I want to guarantee that there's
> never a split brain, then this isn't good enough.

Correct. That said, plenty of software relies on hardware watchdog
timers. While a WDT is not externally verifiable, I would estimate that
the chances of one failing when configured correctly are about as likely
as a fence device falsely claiming success (which could happen ... in
theory). The key is figuring out how to make it reliable on the software
side. I think that the watchdog daemon is probably the answer (or pretty
darn close).

Practically speaking, when thinking about fencing or related
technologies, it helps to enumerate the failure cases you're worried
about and which component's job it is to handle each one. For example:

 * Kernel panic -> handled by WDT
 * System loses power -> don't care (the node is dead anyway)
 * Watchdog daemon hang -> handled by WDT
 * ...
 * Network disconnect -> watchdog daemon?
 * Cluster software hangs/crashes -> watchdog daemon?
 * ...

The slim cases where things might still break generally involve the
(properly configured) watchdog daemon misbehaving at the same time the
node or cluster software misbehaves:

 * Network disconnect + watchdog daemon doesn't notice ...
 * Cluster software hang/crash + watchdog daemon doesn't notice ...
 * ...

Historically (i.e.
RHEL3's clumanager), we had watchdog daemon support built in to the
cluster membership layer. That design has advantages because it solves
some of the problems in a fairly concrete way (e.g. "cluster software
hang" is a non-issue, since a hang would cause the WDT to trigger).
Unfortunately, it also has disadvantages, which is why it's not in the
current cluster software (e.g. the membership layer is time-critical, so
it can't afford to do things like check routing).

-- Lon

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
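[Editor's note: to make the "pet only while every check passes" idea above
concrete, here is a minimal sketch of the core loop of a user-space
watchdog daemon. The check names, interval, and structure are
illustrative assumptions, not the actual watchdog daemon's code; only
the /dev/watchdog write semantics come from the standard Linux watchdog
API.]

```python
# Sketch of a user-space watchdog daemon loop: run a set of health
# checks every interval and "pet" the hardware timer (by writing to
# /dev/watchdog) only while every check passes.  If any check fails --
# network disconnect, cluster software hang, etc. -- the daemon stops
# petting, the WDT expires, and the node reboots.  Check functions and
# the interval are hypothetical; any byte written resets the timer per
# the Linux watchdog API.
import os
import time

WATCHDOG_DEV = "/dev/watchdog"  # standard Linux watchdog device node


def should_pet(checks):
    """Pet the WDT only if *every* registered health check passes."""
    return all(check() for check in checks)


def pet(fd):
    os.write(fd, b"\0")  # any write resets the hardware countdown


def run(checks, interval=10, device=WATCHDOG_DEV):
    fd = os.open(device, os.O_WRONLY)
    try:
        while should_pet(checks):
            pet(fd)
            time.sleep(interval)
        # Fall through without petting: the WDT fires and reboots the
        # node.  No positive confirmation of fencing, but fail-safe.
    finally:
        # Without the magic-close 'V' write, closing the device does
        # not disarm the timer, so a daemon crash also leads to reboot.
        os.close(fd)
```

The important property is that the daemon fails *closed*: a hang in the
daemon itself (no pets) or a failed check both end in the WDT firing,
which is what makes the "watchdog daemon hang -> handled by WDT" entry
in the list above work.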