Stewart Walters wrote:
carlopmart wrote:
carlopmart wrote:
Hi all,
I need to set up another RHCS cluster today with two nodes. But every time
I start the second node, node1 returns this error:
cman killed by node 2 because we rejoined the cluster without a full
restart
.. and cman stops on node1. Why?? I didn't find any solution under
http://sources.redhat.com/cluster/wiki/FAQ/
My nodes are RHEL 5.3.
Many thanks.
Please, I need your help ... Any ideas???
Sounds like node1 fenced node2, and node2 hasn't been rebooted since
being fenced. Either that, or node2 uses manual fencing and you haven't
yet manually acknowledged that it was rebooted.
Check your logs in /var/log/messages on node1, I'm pretty sure you'll
see a reference there that node2 has been fenced.
You'll probably also see somewhere in the logs on node1, that it
detected node2 did not leave the cluster after being fenced, and as a
result node1 itself has decided to stop itself to prevent data
corruption (the message will be something like that anyway).
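As a rough sketch of that log check, something like the following could
be run on node1. The function name fence_events and the search patterns
are my own assumptions, not anything shipped with RHCS; the only message
text taken from this thread is the "rejoined the cluster" error.

```shell
# Hypothetical helper: print log lines that mention fencing or the
# "rejoined the cluster" error quoted earlier in this thread.
# The patterns are assumptions; adjust them to what your logs contain.
# Usage on node1: fence_events            (reads /var/log/messages)
#                 fence_events /some/file (reads another log file)
fence_events() {
    grep -iE 'fenc|rejoined the cluster' "${1:-/var/log/messages}"
}
```

Reading /var/log/messages normally requires root. If nothing matches,
the function prints nothing and returns non-zero.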
If you are using manual fencing on node2, after you reboot it you need
to run "fence_ack_manual -n <node2>" from node1. Do this only after
you've restarted node2 but before cman starts back up on it in the next
boot sequence. At this point node1 will stop fencing node2 and both
nodes should be able to join the cluster successfully.
Manual fencing is evil :-)
Try to avoid it if you can - as you'll get this scenario on your cluster
every time a node is fenced. This is the reason why Red Hat write in
their documentation numerous times that manual fencing is not supported
in Production clusters (it's almost as if they're trying to tell us
something...). ;-)
Also, you mentioned that the solution was not found in the FAQ. While
it might not include a reference to this specific symptom, I'm pretty
sure the FAQ, the man pages for fence_manual and the RHCS documentation
from Red Hat all cover the requirement to manually acknowledge nodes
that use manual fencing. If you do in fact employ
manual fencing in your cluster, you might want to go over this
documentation again.
If you don't use manual fencing, please accept my apologies for
expressing my general distaste for manual fencing instead of actually
helping you!! :-)
Kind Regards,
Stewart
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
Many thanks for your help Stewart, but I don't use manual fencing as the fence
device in this cluster. I am using gnbd fencing instead.
I have posted my cluster.conf below.
--
CL Martinez
carlopmart {at} gmail {d0t} com
<?xml version="1.0"?>
<cluster alias="TestCluster" config_version="3" name="TestCluster">
  <fence_daemon post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="rhelnode01" nodeid="1" votes="1">
      <fence>
        <method name="gnbd">
          <device name="gnbd-fence" nodename="rhelnode01"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="rhelnode02" nodeid="2" votes="1">
      <fence>
        <method name="gnbd">
          <device name="gnbd-fence" nodename="rhelnode02"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman two_node="1" expected_votes="1">
    <multicast addr="239.192.55.11"/>
  </cman>
  <fencedevices>
    <fencedevice agent="fence_gnbd" name="gnbd-fence" servers="gnbdserv"/>
  </fencedevices>
  <rm log_facility="local4" log_level="6">
    <failoverdomains>
      <failoverdomain name="FullCluster1" ordered="1" restricted="1">
        <failoverdomainnode name="rhelnode01" priority="1"/>
        <failoverdomainnode name="rhelnode02" priority="2"/>
      </failoverdomain>
      <failoverdomain name="FullCluster2" ordered="1" restricted="1">
        <failoverdomainnode name="rhelnode02" priority="1"/>
        <failoverdomainnode name="rhelnode01" priority="2"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <ip address="172.25.50.18" monitor_link="1"/>
      <ip address="172.25.50.19" monitor_link="1"/>
      <ip address="172.25.50.20" monitor_link="1"/>
      <ip address="172.25.50.21" monitor_link="1"/>
      <ip address="172.25.50.22" monitor_link="1"/>
      <ip address="172.25.50.23" monitor_link="1"/>
      <ip address="172.25.50.24" monitor_link="1"/>
      <ip address="172.25.50.25" monitor_link="1"/>
    </resources>
  </rm>
</cluster>