On 09/21/2012 06:39 AM, Ralf Aumueller wrote:
On 09/20/2012 07:54 PM, Digimer wrote:
On 09/20/2012 12:21 PM, Ralf Aumueller wrote:
Hello,
we have a two-node CentOS 6.2 cluster (rgmanager-3.0.12.1-5). After a reboot of
node2 the cluster doesn't work as expected. On node2, clustat just says:
clustat:
Cluster Status for cluster1 @ Thu Sep 20 17:06:02 2012
Member Status: Quorate
 Member Name        ID   Status
 ------ ----        ---- ------
 node1                 1 Online
 node2                 2 Online, Local
No services listed, no rgmanager running. Also it is not possible to
start/migrate any services to node2.
On node1, clustat lists all configured services and shows rgmanager under
Status for both nodes. On node1 the rgmanager.log has lots of:
rgmanager #37: Error receiving header from 2 sz=0 CTX 0x1XXXXXX
On node2 the rgmanager.log gives me:
rgmanager #34: Cannot get status for service ...
I did not change cluster.conf. The only changes on node2 were +48MB and a new
BIOS version (recommended by Dell Support).
Best regards,
Ralf
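The symptom described above (members Online but no rgmanager in the Status column) can be checked mechanically. The following is only a minimal sketch: the function name members_without_rgmanager is hypothetical, the sample text is the member table quoted in this thread, and the "healthy" sample assumes that clustat adds an "rgmanager" flag to the Status column when the daemon is running, as the node1 output is described.

```python
def members_without_rgmanager(clustat_text):
    """Return names of online members whose Status column lacks 'rgmanager'.

    Assumes the member table layout quoted in this thread:
    a dashed separator line followed by 'name id status...' rows.
    """
    missing = []
    in_table = False
    for line in clustat_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("------"):
            in_table = True          # separator marks the start of member rows
            continue
        if not in_table:
            continue
        if not stripped:
            break                    # a blank line ends the member table
        parts = stripped.split(None, 2)
        if len(parts) < 3:
            continue                 # not a 'name id status' row
        name, _node_id, status = parts
        flags = [flag.strip() for flag in status.split(",")]
        if "Online" in flags and "rgmanager" not in flags:
            missing.append(name)
    return missing


# Member table as seen on node2 in this thread.
BROKEN = """\
Member Name ID Status
------ ---- ---- ------
node1 1 Online
node2 2 Online, Local
"""

# Assumed appearance once rgmanager is running on both nodes.
HEALTHY = """\
Member Name ID Status
------ ---- ---- ------
node1 1 Online, rgmanager
node2 2 Online, Local, rgmanager
"""

print(members_without_rgmanager(BROKEN))
print(members_without_rgmanager(HEALTHY))
```

In practice the text would come from running clustat on the node in question; the sketch only inspects the member table and ignores the service list.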
Sounds like you hit this bug:
http://rhn.redhat.com/errata/RHBA-2012-0897.html
Update rgmanager to rgmanager-3.0.12.1-12 and you should be ok.
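When checking whether a node already has the fixed package, note that the release numbers here (-5 versus -12) compare wrongly as plain strings. A rough sketch of a numeric comparison follows; the helper names are hypothetical, and this is not RPM's real version-comparison algorithm. It only handles purely numeric version-release strings like the two quoted in this thread, and would raise ValueError on a release with a suffix such as a dist tag.

```python
def parse_version_release(vr):
    """Split 'version-release' into two tuples of integers.

    Simplification: every dot-separated field must be numeric.
    """
    version, _, release = vr.partition("-")
    version_parts = tuple(int(p) for p in version.split("."))
    release_parts = tuple(int(p) for p in release.split(".")) if release else ()
    return version_parts, release_parts


def is_older(installed, fixed):
    """True if 'installed' sorts strictly before 'fixed' numerically."""
    return parse_version_release(installed) < parse_version_release(fixed)


INSTALLED = "3.0.12.1-5"   # rgmanager version from the original report
FIXED = "3.0.12.1-12"      # version recommended in the reply

print(is_older(INSTALLED, FIXED))   # numeric comparison: 5 < 12
print(INSTALLED < FIXED)            # plain string comparison gets this wrong
```

The string comparison fails because "5" sorts after "1" character by character, which is why the fields are converted to integers first.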
Did an update of rgmanager on both nodes. Just stopping/starting the
cluster services didn't resolve the problem. A shutdown of both nodes and then a
restart solved the problem.
Thanks and best regards,
Ralf
If I recall correctly, I also had to reboot after the update was
applied. Now that I've been able to remember better, I think this was
caused by the leap second some time back. That leap second hit a lot of
programs, and I believe this includes stuff in the kernel itself.
Glad it's resolved!
--
Digimer
Papers and Projects: https://alteeve.ca