Hi All,
I have two machines (node1 at 192.168.0.27 and node2 at 192.168.0.28),
each with one NIC, running Red Hat Cluster Suite 4 with DLM. I have
created a manual fence device, a failover domain, and two services:
"www" (listening address 192.168.0.111) and "ftp" (listening address
192.168.0.112).
After the initial setup, everything seems to work fine: I can relocate
the services manually from node1 to node2 or vice versa, and stop and
start them.
But when I tried to test the failover capability by shutting down the
network on one node (e.g. bringing down eth0 on node1), failover did not
work most of the time. This is the scenario I tested:
Scenario: Both services running on node1, then I shut down eth0 on
node1.
Result: The services did not fail over to node2, and clustat on node1 showed:
Member Status: Quorate

  Member Name          Status
  ------ ----          ------
  node1                Offline
  node2                Online, Local, rgmanager

  Service Name         Owner (Last)         State
  ------- ----         ----- ------         -----
  ftp                  unknown              started
  www                  unknown              started

Both services were no longer working. Bringing eth0 back up on node1 and
restarting the cman service on node1 did not help. Also, when I tried to
restart rgmanager on node1, it only showed "Waiting for services to
stop: " and waited forever. Even killing the rgmanager process did not
work. Finally, I had to reset both machines to get the cluster back to
normal.
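Roughly, the steps above correspond to the sketch below. It assumes eth0
was taken down and brought back with ifdown/ifup and that cman and
rgmanager were restarted through their init scripts; the run helper and
the reproduce_failover_test name are illustrative only. By default it
only prints the commands instead of running them.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 on a test node (node1 here) to really run them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Assumed command sequence for the failover test described above.
reproduce_failover_test() {
    run clustat                    # check that both services are started
    run ifdown eth0                # take down the only NIC on node1
    run clustat                    # check membership and service state again
    run ifup eth0                  # bring the NIC back
    run service cman restart       # attempted recovery
    run service rgmanager restart  # this is the step that hung for me
}

reproduce_failover_test
```

With DRY_RUN left at its default this just lists the six commands in
order, which is enough to compare against what you would run yourself.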
I would appreciate any help, or hearing from anyone who has run into the
same problem.
I have attached the cluster.conf below for reference.
======cluster.conf=========
<?xml version="1.0"?>
<cluster config_version="34" name="alpha_cluster">
  <fence_daemon post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="node1" votes="1">
      <fence>
        <method name="1">
          <device name="Fence" nodename="node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2" votes="1">
      <fence>
        <method name="1">
          <device name="Fence" nodename="node2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_manual" name="Fence"/>
  </fencedevices>
  <rm>
    <failoverdomains>
      <failoverdomain name="aaa" ordered="0" restricted="0">
        <failoverdomainnode name="node1" priority="1"/>
        <failoverdomainnode name="node2" priority="1"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <ip address="192.168.0.111" monitor_link="0"/>
      <script file="/etc/rc.d/init.d/httpd" name="www"/>
      <script file="/etc/rc.d/init.d/vsftpd" name="ftp"/>
      <ip address="192.168.0.112" monitor_link="0"/>
    </resources>
    <service autostart="1" domain="aaa" name="ftp" recovery="relocate">
      <ip ref="192.168.0.112"/>
      <script ref="ftp"/>
    </service>
    <service autostart="1" domain="aaa" name="www" recovery="relocate">
      <ip ref="192.168.0.111"/>
      <script ref="www"/>
    </service>
  </rm>
</cluster>
==========END==========
Many Thanks,
Dicky