Mark Hlawatschek wrote:
> Hi,
>
> I'd like to discuss and collect information about the two different fencing
> approaches.
>
> External fencing: the failed cluster node is disconnected from the storage
> device by another node in the cluster. After a failure is detected, all cluster
> activities are suspended until the I/O fencing of the failed node has
> completed successfully.
>
> Watchdog fencing: a failed cluster node has to recognize the failure by itself
> and is shut down by a kind of internal watchdog feature.
>
> Now, I see that theoretically the external fencing method (when configured
> correctly) is the better approach, because the state is exactly defined
> during a fencing and recovery operation.
>
> But the question is: what are real-world examples of failures where
> watchdog fencing would fail and cause data corruption on the storage device?
> I'd like to collect some real-world examples and also theoretical approaches.
>
> All comments welcome!

Hello,

Watchdog fencing isn't good for at least two reasons:

1. The watchdog is a piece of code that runs in user space, so you have no
   100% guarantee that it will run correctly.
2. Watchdog fencing can't protect you against split-brain situations, where
   the consequence could be corruption of your data. This is where external
   fencing comes in.

There is another way to look at this: Linux clusters versus commercial
clusters (e.g. Sun Cluster). The Linux cluster resides in user space, so you
have no guarantee that local fencing will run correctly, and you need external
fencing to solve that main problem. Sun Cluster resides in kernel space, so
when a node loses quorum it panics the kernel, and you have a 100% guarantee
that this will succeed.

For me, network fencing (IPMI, DRAC, ...) isn't good either, because you have
to connect over the network, and that connection can fail as well. The best
fencing mechanism is fence_scsi, which is an I/O fencing agent.
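To make the idea of I/O fencing at the storage level concrete, here is a toy
sketch in Python. It is only a model: it talks to no real device and it is not
how fence_scsi is implemented. The class SharedDisk and its methods are names
made up for this illustration. The assumption is a reservation in the style of
"write exclusive, registrants only", where a surviving node removes the failed
node's key and the device then refuses that node's writes.

```python
# Toy model only: no real SCSI commands are issued, and every name here
# (SharedDisk, register, preempt, write) is hypothetical.

class SharedDisk:
    """A shared LUN that only accepts writes from nodes with a registered key."""

    def __init__(self):
        self.keys = {}      # node name -> registered reservation key
        self.blocks = {}    # block number -> data

    def register(self, node, key):
        """A node registers its key when it joins the cluster."""
        self.keys[node] = key

    def preempt(self, survivor, victim):
        """A registered survivor removes the victim's key (the fencing step)."""
        if survivor not in self.keys:
            raise PermissionError(f"{survivor} is not registered")
        self.keys.pop(victim, None)

    def write(self, node, block, data):
        """The device itself rejects writes from nodes without a key."""
        if node not in self.keys:
            return False    # models a reservation conflict
        self.blocks[block] = data
        return True


disk = SharedDisk()
disk.register("node1", 0x1)
disk.register("node2", 0x2)

# While node2 is a healthy member, its write is accepted.
assert disk.write("node2", 0, "from node2")

# node2 hangs or drops off the cluster network; node1 fences it at the storage.
disk.preempt("node1", "node2")

# Even if node2 wakes up later and never shut itself down, its write is refused.
print(disk.write("node2", 0, "stale write"))   # False
print(disk.blocks[0])                          # from node2
```

The point of the sketch is that the decision is enforced by the storage, not
by the failed node, so it does not depend on that node still running anything
correctly.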
fence_scsi can be used with SCSI devices that support persistent reservations
(SPC-2 or greater). In most cases, shared storage supports SPC-2 or SPC-3.

Best Regards
Maciej Bogucki

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster