Cluster services stopping


 



I'm trying to figure out why my cluster services keep stopping for no
apparent reason.  What the stopped services have in common is the
following set of resources:  1 GFS file system, 1 IP address, and 1 or 2
init scripts.  The init scripts vary between apache, tomcat, mysql, and
squid.

Normally, if a process dies and a status check on the init script
returns a non-zero exit status, that event gets logged, but that isn't
happening when these services are stopped.  The first logged event
related to a failed service is the one shown below; after it, the
service is stopped and recovered.

"May 28 19:11:33 tf36 clurgmgrd[4418]: <notice> Stopping service twapp"
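
To rule out the status checks themselves, they could be run by hand in a
loop like the one below.  This is only a sketch: the init script names
under /etc/init.d (httpd, tomcat, mysqld, squid) are assumptions and may
differ on these nodes, and check_status is a made-up helper, not part of
rgmanager.

```shell
#!/bin/sh
# Run "<script> status" and print a timestamped line with its exit code.
# A non-zero exit code is what a failed status check looks like.
check_status() {
    script="$1"
    "$script" status >/dev/null 2>&1
    rc=$?
    echo "$(date '+%b %d %H:%M:%S') $script status rc=$rc"
    return $rc
}

# Assumed init script names; adjust to the ones the services actually use.
for name in httpd tomcat mysqld squid; do
    path="/etc/init.d/$name"
    if [ -x "$path" ]; then
        check_status "$path"
    else
        echo "skipping $path (not found or not executable)"
    fi
done
```

Running this periodically and comparing the timestamps against the
"Stopping service" lines would show whether a status check was actually
failing around the time of a stop.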

These nodes remain mostly idle and have a lot of horsepower.  Some
helpful information:

[smccl@tf36 log]$ rpm -q rgmanager cman
rgmanager-1.9.46-0
cman-1.0.4-0

[smccl@tf36 log]$ uname -osrvmpi
Linux 2.6.9-34.ELhugemem #1 SMP Wed Mar 8 00:47:12 CST 2006 i686 i686 i386 GNU/Linux

[smccl@tf36 log]$ cat /etc/redhat-release
CentOS release 4.3 (Final)

Any help is appreciated.  I can provide more information if that would
help.  Also, is there some sort of debugging within rgmanager I can
enable to see what is actually failing or timing out and requiring a
restart of these services?
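
In the meantime, the log context around each stop could be collected
with something like the sketch below.  It assumes syslog writes to
/var/log/messages and that GNU grep is available for the -B option; LOG,
CONTEXT and stop_context are illustrative names, not rgmanager settings.

```shell
#!/bin/sh
# Print every clurgmgrd "Stopping service" line together with the
# CONTEXT lines that precede it in the given log file.
LOG="${LOG:-/var/log/messages}"
CONTEXT="${CONTEXT:-20}"

stop_context() {
    grep -B "$CONTEXT" 'clurgmgrd\[[0-9]*]: <notice> Stopping service' "$1"
}

if [ -r "$LOG" ]; then
    stop_context "$LOG" || echo "no stop events found in $LOG"
fi
```

Anything logged by other daemons in the lines leading up to a stop would
show up in this output and may hint at what triggered it.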

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
