rgmanager ceases to send syslog messages

"Robert Hurst" <rhurst@xxxxxxxxxxxxxxxxx> · Tue, 14 Aug 2007 14:46:39 -0400

Oddly, the rgmanager daemon (clurgmgrd) on one member node has stopped sending syslog messages, in particular the periodic 'status' message for a service it is running.  This is a problem for us, because a centralized server monitors those syslog messages to track which services are running on which node.

Is there a signal or event that can trigger clurgmgrd to restart its monitoring and logging of its running service?

The last 'WATSON status' messages it logged follow.  I realize there was a problem with this particular cluster.conf change, but the change had nothing to do with the WATSON service, and all other nodes are still sending their 'service status' syslog messages.  Why would 'WATSON status' just stop?

Aug  6 14:38:35 db5 clurgmgrd: [16354]: <info> Executing /etc/init.d/WATSON status
Aug  6 14:39:05 db5 clurgmgrd: [16354]: <info> Executing /etc/init.d/WATSON status
Aug  6 14:39:20 db5 ccsd[13802]: Update of cluster.conf complete (version 187 -> 188).
Aug  6 14:39:25 db5 clurgmgrd[16354]: <notice> Reconfiguring
Aug  6 14:39:25 db5 clurgmgrd[16354]: <info> Loading Service Data
Aug  6 14:39:25 db5 clurgmgrd[16354]: <err> Error storing ip: Duplicate
Aug  6 14:39:26 db5 clurgmgrd[16354]: <err> Unique attribute collision. type=clusterfs attr=device value=/dev/VGCCC1/lvol0
Aug  6 14:39:26 db5 clurgmgrd[16354]: <err> Error storing clusterfs resource
Aug  6 14:39:26 db5 clurgmgrd[16354]: <err> Unique attribute collision. type=clusterfs attr=device value=/dev/VGCCC1/lvol1
Aug  6 14:39:26 db5 clurgmgrd[16354]: <err> Error storing clusterfs resource
Aug  6 14:39:26 db5 clurgmgrd[16354]: <info> Stopping changed resources.
Aug  6 14:39:26 db5 clurgmgrd[16354]: <info> Restarting changed resources.
Aug  6 14:39:26 db5 clurgmgrd[16354]: <info> Starting changed resources.
Aug  6 14:39:26 db5 clurgmgrd: [16354]: <info> Executing /etc/init.d/syslogger stop
Aug  6 14:39:27 db5 clurgmgrd: [16354]: <info> Executing /etc/init.d/luci stop
Aug  6 14:39:27 db5 clurgmgrd: [16354]: <info> Executing /etc/init.d/webmin stop
Aug  6 14:39:27 db5 clurgmgrd: [16354]: <info> Executing /etc/init.d/nagios stop

I still get messages from clurgmgrd, but only for Magma events, e.g.:

Aug  7 16:09:03 db5 clurgmgrd[16354]: <info> Magma Event: Membership Change
Aug  7 16:09:03 db5 clurgmgrd[16354]: <info> State change: db1 UP
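
For reference, here is a rough sketch of the kind of check a central syslog monitor could run to notice when a node's status messages go quiet.  It assumes only the line format shown in the excerpts above.  The function names and the 120-second staleness threshold are hypothetical (the two WATSON checks above happen to be 30 seconds apart), and the year has to be supplied because syslog timestamps omit it.  This is not our actual monitoring code.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as:
#   Aug  6 14:38:35 db5 clurgmgrd: [16354]: <info> Executing /etc/init.d/WATSON status
# Both "clurgmgrd: [pid]:" and "clurgmgrd[pid]:" prefixes are accepted.
STATUS_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(?P<node>\S+)\s+"
    r"clurgmgrd:?\s*\[\d+\]:\s+<info>\s+Executing\s+"
    r"/etc/init\.d/(?P<service>\S+)\s+status\s*$"
)


def last_status_checks(lines, year):
    """Return {(node, service): datetime of the most recent status check}."""
    latest = {}
    for line in lines:
        m = STATUS_RE.match(line.strip())
        if m is None:
            continue  # not a 'status' execution line
        # Collapse the padding in "Aug  6" and add the missing year.
        stamp = " ".join(m.group("ts").split())
        ts = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        key = (m.group("node"), m.group("service"))
        if key not in latest or ts > latest[key]:
            latest[key] = ts
    return latest


def stale_services(latest, now, max_age=timedelta(seconds=120)):
    """Return the sorted (node, service) pairs whose last check is older than max_age."""
    return sorted(key for key, ts in latest.items() if now - ts > max_age)
```

Run against the log above with "now" set to the time of the Aug 7 Magma event, this would report ('db5', 'WATSON') as stale, which is exactly the condition I am trying to get clurgmgrd out of.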
