>> Very informative post...thanks! The scenario you mentioned with a dead
>> switch port (or a single unplugged network cable, or whatever) is
>> something I had thought about, and I considered it to be a strike
>> against using a crossover cable.
>
> How does that follow? With a switch in the middle your points of failure
> are:
> cable, switch, cable

The potential problem with the crossover cable design is this: although
the cluster communication goes over the crossover cable, user connections
to the cluster service still go through the switch.

Suppose node 1 is active and node 2 is standby. Node 1 loses its
connection to the switch for whatever reason, but node 2 doesn't. Since
the heartbeat goes across the crossover cable, the nodes think nothing is
wrong, so no failover occurs and the service is unreachable for users. If
the service had failed over to node 2 (which can still talk to the
switch), users could still reach it.

Dropping the crossover cable and sending the cluster traffic through the
switch solves this problem nicely -- both nodes try to fence, but node 1
can't reach anything, so node 2 kills node 1 and the service comes up on
node 2.

But then, of course, you have the pathological case where the switch
itself goes down: neither node can talk to the other until the switch
comes back up, and boom, they both fence each other.

Maybe the monitor_link option in conjunction with the crossover-cable
heartbeat will fix this. I'm in the process of setting that up right now,
so I'll post back when I have a result.

-Andrew L

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
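
The failure scenarios described in the post can be written down as a toy
decision table. This is only a sketch for reasoning about the cases, not
cluster software: the function name `simulate`, the node labels, and the
result strings are all made up for illustration. The `link_check` flag
stands in for what the post hopes monitor_link will do; that behaviour is
an assumption here and is not confirmed by the post. The crossover cable
itself is assumed to stay healthy.

```python
def simulate(active_link_up, standby_link_up, heartbeat, link_check=False):
    """Toy model of a two-node cluster (node1 active, node2 standby).

    active_link_up / standby_link_up: can each node reach the switch?
    heartbeat: "crossover" (dedicated cable) or "switch" (via the switch).
    link_check: hypothetical stand-in for per-service link monitoring.

    Returns (node serving users or None, short explanation).
    """
    if heartbeat == "crossover":
        # The heartbeat never sees a switch-side failure, so the cluster
        # only reacts if something else checks the public link.
        if active_link_up:
            return "node1", "no failover needed"
        if link_check and standby_link_up:
            return "node2", "link check relocated the service"
        return None, "heartbeat healthy, service unreachable"

    if heartbeat == "switch":
        # Losing the switch path also breaks the heartbeat, so fencing starts.
        if active_link_up and standby_link_up:
            return "node1", "no failover needed"
        if active_link_up:
            return "node1", "node1 fenced node2"
        if standby_link_up:
            return "node2", "node2 fenced node1"
        return None, "both nodes fence each other when the switch returns"

    raise ValueError(f"unknown heartbeat path: {heartbeat!r}")


if __name__ == "__main__":
    # Node 1 loses its switch link, node 2 keeps it.
    for hb, check in [("crossover", False), ("switch", False), ("crossover", True)]:
        print(hb, "link_check" if check else "no link_check",
              simulate(False, True, hb, link_check=check))
    # Switch itself is down: neither node has a link.
    print("switch", "both links down", simulate(False, False, "switch"))
```

Running it lists the outcome for each design when node 1 alone loses its
switch link, followed by the downed-switch case for the heartbeat-via-switch
design.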