RE: Cluster Suite 4 failover problem

Hi Jeff & Lon,

Thanks for the reply.

Regarding the services that did not fail over (the status just displayed "Owner --> unknown" and "State --> started", but none of the services were actually available), I checked the log and agree that fence_manual is the cause. The log shows that fence_manual was waiting for node2 to rejoin the cluster. As soon as I ran the command fence_ack_manual -n node2, the failed services failed over to node1 and all of them returned to normal.

I would like to know whether there is any solution or workaround for this situation other than buying a fence device :) Can I remove fence.rpm? Would that cause any other problems? I ask because in a production environment we never know when a machine will go down, so we cannot run the fence_ack_manual command immediately.

========/var/log/messages======
kernel: CMAN: removing node node2 from the cluster : Missed too many heartbeats
fenced[2447]: node2 not a cluster member after 0 sec post_fail_delay
fenced[2447]: fencing node "node2"
fence_manual: Node node2 needs to be reset before recovery can procede. Waiting for node2 to rejoin the cluster or for manual acknowledgement that it has been reset (i.e. fence_ack_manual -n node2)
=======END================
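
Until a real fence device is in place, the fence_manual message in the excerpt above could at least be detected automatically so that someone is alerted quickly. Below is a minimal sketch in Python, assuming the message keeps the wording quoted above (it may differ between versions); the function name nodes_waiting_for_ack is hypothetical. It only prints a reminder and deliberately does not run fence_ack_manual itself, because the acknowledgement is meant to confirm that the node really has been reset.

```python
import re

# Pattern taken from the fence_manual line quoted in the log excerpt.
# Assumption: other versions may word the message differently.
FENCE_WAIT = re.compile(r"fence_manual: Node (\S+) needs to be reset")


def nodes_waiting_for_ack(log_lines):
    """Return the nodes fence_manual is waiting on, in order, without duplicates."""
    waiting = []
    for line in log_lines:
        match = FENCE_WAIT.search(line)
        if match and match.group(1) not in waiting:
            waiting.append(match.group(1))
    return waiting


# Sample lines copied from the /var/log/messages excerpt in this mail.
sample = [
    "kernel: CMAN: removing node node2 from the cluster : Missed too many heartbeats",
    'fenced[2447]: fencing node "node2"',
    "fence_manual: Node node2 needs to be reset before recovery can procede. "
    "Waiting for node2 to rejoin the cluster or for manual acknowledgement "
    "that it has been reset (i.e. fence_ack_manual -n node2)",
]

for node in nodes_waiting_for_ack(sample):
    # Remind the operator only; the reset must be verified by a person.
    print(f"{node}: confirm it has been reset, then run: fence_ack_manual -n {node}")
```

In practice the lines would come from /var/log/messages rather than a hard-coded list, and the print call would be replaced by whatever alerting is available.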


Regarding the monitor_link issue, I set monitor_link=1 for both IP resources (192.168.0.111 and 192.168.0.112), then shut down eth0 on node2 and re-enabled it. When I tried to restart rgmanager on node2 (the failed node), it still hung at the message "Shutting down Cluster Service Manager... Waiting for services to stop: ". I have to kill the rgmanager processes or, even worse, reset the machine. Any ideas?

One more thing: even with monitor_link=0 in cluster.conf, the Monitor Link box under system-config-cluster --> Resource --> IP address is still ticked. Why?
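
To rule out a mismatch between the file and the GUI, the monitor_link values can be read directly from the configuration. The following is a rough sketch in Python, assuming the IP resources appear as ip elements with address and monitor_link attributes; the embedded fragment is a hypothetical, trimmed-down example rather than my real cluster.conf, and the helper name ip_monitor_link is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down cluster.conf fragment for illustration only.
SAMPLE_CONF = """
<cluster name="example" config_version="1">
  <rm>
    <resources>
      <ip address="192.168.0.111" monitor_link="1"/>
      <ip address="192.168.0.112" monitor_link="0"/>
    </resources>
  </rm>
</cluster>
"""


def ip_monitor_link(conf_xml):
    """Map each ip resource address to its monitor_link value (None if unset)."""
    root = ET.fromstring(conf_xml)
    return {ip.get("address"): ip.get("monitor_link") for ip in root.iter("ip")}


for address, value in ip_monitor_link(SAMPLE_CONF).items():
    print(address, "monitor_link =", value)
```

Running the same function on the text of the real cluster.conf shows what the file actually contains, independent of what the GUI displays.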

Many thanks,
Dicky

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
