Re: Re: Node2 kills node1 when it is booting ...

Flavio Junior <billpp@xxxxxxxxx> · Tue, 27 Jan 2009 10:43:32 -0200

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Did you disable the ACPI daemon (acpid) on Linux?
Fencing needs an immediate shutdown. If your system is rebooting through
acpid, the cluster will not detect the fence as successful.

2c

- --

Flávio do Carmo Júnior aka waKKu

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (MingW32)
Comment: http://getfiregpg.org

iEYEARECAAYFAkl/AXgACgkQgyuXjr6dykvKhwCeMAMqzxWzosyC0WdQTgAMWcPh
zboAni5LKe6pVO2LHa4jndI/UEZKQGaR
=Dw+3
-----END PGP SIGNATURE-----
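
A minimal sketch of the acpid check mentioned above, assuming RHEL 5 style SysV tooling where `service` and `chkconfig` manage the daemon. Neither command appears in the original post, and the function name `acpid_status` is made up for illustration. The snippet only reports and prints suggested commands; it changes nothing on the node.

```shell
#!/bin/sh
# Hypothetical helper: report whether acpid is currently running.
acpid_status() {
    # pgrep -x matches the exact process name "acpid"
    if pgrep -x acpid >/dev/null 2>&1; then
        echo "acpid is running: a fence power-off may become a graceful shutdown"
        # Assumption: RHEL 5 SysV tooling; run these as root on each node.
        echo "suggested: service acpid stop && chkconfig acpid off"
    else
        echo "acpid is not running"
    fi
}

acpid_status
```

Running it on both nodes before testing fencing again should show whether acpid is still in the way.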

On Tue, Jan 27, 2009 at 8:30 AM, carlopmart <carlopmart@xxxxxxxxx> wrote:
> Stewart Walters wrote:
>>
>> carlopmart wrote:
>>>
>>> Stewart Walters wrote:
>>>>
>>>> carlopmart wrote:
>>>>>
>>>>> carlopmart wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>>  I need to set up another RHCS cluster today with two nodes. But every
>>>>>> time I start the second node, node1 returns this error:
>>>>>>
>>>>>> cman killed by node 2 because we rejoined the cluster without a full
>>>>>> restart
>>>>>>
>>>>>>  ... and cman stops on node1. Why? I didn't find any solution under
>>>>>> http://sources.redhat.com/cluster/wiki/FAQ/
>>>>>>
>>>>>>  My nodes run RHEL 5.3.
>>>>>>
>>>>>>  Many thanks.
>>>>>>
>>>>>
>>>>> Please, I need your help ... Any ideas???
>>>>>
>>>>
>>>> Sounds like node1 fenced node2, and node2 hasn't been rebooted since
>>>> being fenced. Either that, or node2 uses manual fencing and you haven't yet
>>>> manually acknowledged that it was rebooted.
>>>>
>>>> Check your logs in /var/log/messages on node1, I'm pretty sure you'll
>>>> see a reference there that node2 has been fenced.
>>>>
>>>> You'll probably also see somewhere in the logs on node1 that it
>>>> detected node2 did not leave the cluster after being fenced, and as a result
>>>> node1 has decided to stop itself to prevent data corruption (the
>>>> message will be something like that anyway).
>>>>
>>>> If you are using manual fencing on node2, after you reboot it you need
>>>> to run "fence_ack_manual -n <node2>" from node1.  Do this only after you've
>>>> restarted node2 but before cman starts back up on it in the next boot
>>>> sequence.  At this point node1 will stop fencing node2 and both nodes should
>>>> be able to join the cluster successfully.
>>>>
>>>> Manual fencing is evil :-)
>>>>
>>>> Try to avoid it if you can - as you'll get this scenario on your cluster
>>>> every time a node is fenced.  This is the reason why Red Hat write in their
>>>> documentation numerous times that manual fencing is not supported in
>>>> Production clusters (it's almost as if they're trying to tell us
>>>> something...). ;-)
>>>>
>>>> Also, you mentioned that the solution was not found in the FAQ.  While
>>>> it might not include a reference to these specific symptoms, I'm pretty sure
>>>> the FAQ, the man pages for fence_manual and the RHCS documentation from Red
>>>> Hat all cover the requirement to manually acknowledge nodes that
>>>> use manual fencing.  If you do in fact employ manual fencing in your
>>>> cluster, you might want to go over this documentation again.
>>>>
>>>> If you don't use manual fencing, please accept my apologies for
>>>> expressing my general distaste for manual fencing instead of actually
>>>> helping you!! :-)
>>>>
>>>> Kind Regards,
>>>>
>>>> Stewart
>>>>
>>>> --
>>>> Linux-cluster mailing list
>>>> Linux-cluster@xxxxxxxxxx
>>>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>>>
>>>
>>> Many thanks for your help Stewart, but I don't use manual fencing as the
>>> fence device in this cluster. I am using gnbd for this.
>>>
>>> Here is my cluster.conf:
>>>
>>> ------------------------------------------------------------------------
>>>
>>
>> Silly question then, have you actually restarted (i.e. actually rebooted)
>> the cluster node1?
>>
>> Regards,
>>
>> Stewart
>>
> Yes, and then it works, but when I need to do an ordered shutdown (node1
> first), the fenced daemon on node2 doesn't stop ...
>
>
>
> --
> CL Martinez
> carlopmart {at} gmail {d0t} com
>
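
As a rough sketch of the log check suggested in the quoted thread, the snippet below filters a syslog file for lines that mention fencing or cman. The function name `fence_lines` and the search pattern are assumptions for illustration; the exact wording of the messages depends on the cluster version.

```shell
#!/bin/sh
# Hypothetical helper: print lines of a log file that mention fencing or cman.
fence_lines() {
    grep -iE 'fence|cman' "$1"
}

# Path taken from the thread; skipped quietly when the file is not readable.
log=/var/log/messages
if [ -r "$log" ]; then
    # Show only the most recent matches; "|| true" tolerates zero matches.
    fence_lines "$log" | tail -n 20 || true
fi
```

Run on node1, this should surface any record that node2 was fenced, along with the "cman killed by node 2" message itself.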
