Re: disabling DLM and GFS kernel modules

The only other thing I can think of is that I started ntpd; since it had not been running, there was a large time adjustment (roughly five hours, per the log below).

Sep 17 10:27:32 ntpd[1118]: synchronized to 206.222.28.90, stratum 2
Sep 17 15:53:38 ntpd[1118]: time reset +18217.299628 s
Sep 17 15:53:38 ntpd[1118]: kernel time sync enabled 0001
Sep 17 15:53:38 openais[4457]: [TOTEM] The token was lost in the OPERATIONAL state.
Sep 17 15:53:38 dlm_controld[4480]: cluster is down, exiting
Sep 17 15:53:38 gfs_controld[4486]: cluster is down, exiting
Sep 17 15:53:38 fenced[4474]: cluster is down, exiting
Sep 17 15:53:38 kernel: dlm: closing connection to node 1
Sep 17 15:53:48 named[8732]: *** POKED TIMER ***
Sep 17 15:53:48 named[8733]: *** POKED TIMER ***
Sep 17 15:54:04 ccsd[4437]: Unable to connect to cluster infrastructure after 30 seconds.
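If that 18000-second step is what broke the token, one way to avoid a repeat is to step the clock once before the cluster stack starts and let ntpd slew afterwards. A minimal sketch, assuming RHEL 5's init layout; 0.pool.ntp.org is a placeholder for your own NTP servers:

    # Step the clock once, before cman/openais start, so ntpd never has
    # to make a multi-hour correction while the totem protocol is running.
    ntpdate -b 0.pool.ntp.org

    # Then run ntpd with -x so later corrections are slewed rather than
    # stepped. On RHEL 5, set this in /etc/sysconfig/ntpd:
    #   OPTIONS="-x -u ntp:ntp -p /var/run/ntpd.pid"
    service ntpd start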



David Teigland wrote:
On Tue, Sep 18, 2007 at 09:34:45AM -0500, Chris Harms wrote:
It said something about an out-of-memory condition. This was logged just before the panic:

groupd[9639]: found uncontrolled kernel object rgmanager in /sys/kernel/dlm
groupd[9639]: local node must be reset to clear 1 uncontrolled instances of gfs and/or dlm
openais[9625]: [CMAN ] cman killed by node 1 because we were killed by cman_tool or other application
fenced[9647]: cman_init error 0 111
dlm_controld[9653]: cman_init error 0 111
gfs_controld[9659]: cman_init error 111

These messages mean that the userspace cluster software all exited for
some unknown reason, leaving behind a dlm lockspace (in the kernel) from
rgmanager.  At this point, you needed to reboot the machine, but instead
you restarted the userspace cluster software, which rightly complained
that you hadn't rebooted the machine, and refused to operate.

This probably doesn't help, though, because it doesn't tell us anything
about the original problem(s) you had.  The original problem(s) probably
caused the cluster software to exit the first time, and was probably
related to the runaway processes.
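For what it's worth, here is a quick way to confirm a node is in that uncontrolled state before deciding to reboot; a sketch using the RHEL 5 cluster tools:

    # Anything still listed under /sys/kernel/dlm after the daemons have
    # exited is an uncontrolled lockspace, like the rgmanager one above:
    ls /sys/kernel/dlm

    # Compare against what groupd currently manages; once the userspace
    # stack has died this prints nothing or fails to connect:
    group_tool ls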


There were two runaway processes related to GFS/DLM before I tried to shut the cluster down. We had not encountered any issues like this before. The only changes to our setup were a superficial change to some cluster services and an upgrade of the DRBD kernel module.

Kevin Anderson wrote:
On Mon, 2007-09-17 at 17:50 -0500, Chris Harms wrote:
Is there an easy way to disable GFS and related kernel modules if one does not need GFS? We are running the 5.1 Beta 1 version of the cluster and had a mysterious crash of the cluster suite. There were issues with the GFS and dlm modules. The kernel panicked on shutdown.

Do you have any details on the panic?

Kevin
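
On the original question of disabling the modules, a minimal sketch for RHEL 5, assuming no GFS filesystems are mounted:

    # Keep the gfs initscript from mounting anything at boot
    # (it is what pulls in the gfs and lock_dlm modules):
    chkconfig gfs off

    # Unload the filesystem modules if nothing is holding them:
    modprobe -r gfs lock_dlm

    # The dlm module itself cannot be removed while rgmanager is running;
    # it holds a dlm lockspace (the /sys/kernel/dlm/rgmanager object in
    # the log above), so only the GFS-specific modules are optional.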

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
