Re: Cluster Suite 4 failover problem

Lon Hohberger <lhh@xxxxxxxxxx> · Thu, 19 Oct 2006 15:14:32 -0400

On Thu, 2006-10-19 at 23:31 +0800, Dicky wrote:

> Both services were no longer working. When I restarted eth0 on
> node1 and restarted the cman service on node1, it still didn't work. Also,
> when I tried to restart rgmanager on node1, it only showed
> "Waiting for services to stop: " and waited forever. Even when I tried to
> kill the rgmanager process, it didn't work. Finally, I had to
> reset both machines to get the cluster service back to normal.

Sounds like 'fencing' isn't working.  After node2 decides node1 is dead,
you have to power off node1, then run "fence_ack_manual" on node2.  That
should let things fail over.
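
For reference, a minimal sketch of that sequence as run on node2. It
assumes the Cluster Suite 4 version of fence_ack_manual takes the failed
node's name via -n (check "man fence_ack_manual" on your install), and it
only prints the commands unless DRY_RUN=0 is set. The run and recover
helper names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: manual recovery after node1 has been declared dead.
# Run on the surviving node (node2), AFTER physically powering off node1.
# Assumption: fence_ack_manual accepts the node name with -n.

FAILED_NODE="${FAILED_NODE:-node1}"   # node name as used in cluster.conf
DRY_RUN="${DRY_RUN:-1}"               # 1 = print commands, 0 = execute them

# Hypothetical helper: print the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

recover() {
    # Tell the fence daemon that the failed node really is powered off.
    run fence_ack_manual -n "$FAILED_NODE"
    # Then check whether the services have moved to the surviving node.
    run clustat
}

recover
```

Acknowledging the fence while node1 is still running risks both nodes
touching shared storage at once, so the power-off step has to come first.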

It looks like there's a typo in clustat, too, but I don't think that's
related :)

> ======cluster.conf=========
>                         <failoverdomain name="aaa" ordered="0" restricted="0">
>                                 <failoverdomainnode name="node1" priority="1"/>
>                                 <failoverdomainnode name="node2" priority="1"/>
>                         </failoverdomain>

FYI, you don't need to define a failover domain if all nodes in the
cluster are equal.

-- Lon

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster