carlopmart wrote:
Stewart Walters wrote:
carlopmart wrote:
carlopmart wrote:
Hi all,
I need to set up another RHCS cluster today with two nodes. But every
time I start the second node, node1 returns this error:
cman killed by node 2 because we rejoined the cluster without a
full restart
.. and cman stops on node1. Why?? I didn't find any solution under
http://sources.redhat.com/cluster/wiki/FAQ/
My nodes are running RHEL 5.3.
Many thanks.
Please, I need your help ... Any ideas???
Sounds like node1 fenced node2, and node2 hasn't been rebooted since
being fenced. Either that, or node2 uses manual fencing and you
haven't yet manually acknowledged that it was rebooted.
Check your logs in /var/log/messages on node1, I'm pretty sure you'll
see a reference there that node2 has been fenced.
You'll probably also see somewhere in the logs on node1, that it
detected node2 did not leave the cluster after being fenced, and as a
result node1 itself has decided to stop itself to prevent data
corruption (the message will be something like that anyway).
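As a rough sketch of that log check: the helper name below is made up, and I'm only assuming the relevant lines contain "fenc" (fence, fenced, fencing) or "cman" somewhere, since the exact message wording may differ on your system.

```shell
# Hypothetical helper: print log lines that mention fencing or cman.
# Defaults to /var/log/messages; pass another path as the first argument.
show_cluster_events() {
    logfile="${1:-/var/log/messages}"
    # -i: ignore case, -E: extended regex so "fenc|cman" matches either word
    grep -i -E 'fenc|cman' "$logfile"
}
```

Run "show_cluster_events" as root on node1 and look at the entries around the time node2 was started.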
If you are using manual fencing on node2, after you reboot it you
need to run "fence_ack_manual -n <node2>" from node1. Do this only
after you've restarted node2 but before cman starts back up on it in
the next boot sequence. At this point node1 will stop fencing node2
and both nodes should be able to join the cluster successfully.
Manual fencing is evil :-)
Try to avoid it if you can - as you'll get this scenario on your
cluster every time a node is fenced. This is the reason why Red Hat
write in their documentation numerous times that manual fencing is
not supported in Production clusters (it's almost as if they're
trying to tell us something...). ;-)
Also, you mentioned that the solution was not found in the FAQ.
While it might not include a reference to these specific symptoms, I'm
pretty sure the FAQ, the man pages for fence_manual and the RHCS
documentation from Red Hat all cover the requirement to manually
acknowledge nodes that use manual fencing. If you do in
fact employ manual fencing in your cluster, you might want to go over
this documentation again.
If you don't use manual fencing, please accept my apologies for
expressing my general distaste for manual fencing instead of actually
helping you!! :-)
Kind Regards,
Stewart
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
Many thanks for your help Stewart, but I don't use manual fencing as
the fence device in this cluster. I am using gnbd for fencing.
I have attached my cluster.conf.
Silly question then, have you actually restarted (i.e. actually
rebooted) the cluster node1?
Regards,
Stewart