Hi!

I started wondering what happens if my fence device is broken. The scenario:

- a node (running a service) fails
- another node notices the lost heartbeats and tries to fence the failed node
- however, the fence device doesn't respond
- ...what now?

I tried to simulate the situation with our test cluster of two HP Blade servers, using iLO fencing, by misconfiguring the fencing agent to use a wrong username to authenticate to the iLO. What happens is that fenced on the running node tries to fence the failed node over and over again, and the service I'm trying to fail over never leaves state "Started" on node "Unknown"; that is, the cluster won't fail it over to the running node.

Not good. If the active node fails and the fence device fails at the same time (for example, if the active node is a Xen guest and the Xen host fails, or if the active node loses power because the network power switch fails or because the iLO gets confused), the service is lost. The Xen scenario doesn't even seem too far-fetched...

Am I missing something?

--Janne Peltonen
Univ. of Helsinki mail admin

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster