Re: Cluster Suite v3 software watchdog

Hi Lon,

Thank you very much for your reply. I'll try your tips.

Now another question: is it really necessary to pass the "nmi_watchdog=1" parameter to the kernel? Or is it enabled by default under RHEL v3 or v4?
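For reference, one way to see whether the parameter is actually being passed at boot is to look at the kernel command line. The following is only a minimal sketch, assuming a Linux system with /proc mounted; the function names are hypothetical, and it reports only what was given on the command line, not what the kernel defaults to when the parameter is absent.

```python
def nmi_watchdog_setting(cmdline):
    """Return the value given for nmi_watchdog= on a kernel command line.

    Returns None if the parameter is not present. If it appears more
    than once, the last occurrence is returned.
    """
    value = None
    for token in cmdline.split():
        if token.startswith("nmi_watchdog="):
            value = token.split("=", 1)[1]
    return value


def read_kernel_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()
```

Calling nmi_watchdog_setting(read_kernel_cmdline()) on the cluster member returns the string after "nmi_watchdog=" if the boot loader passed it, or None otherwise.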

Regards,

Celso.

Lon Hohberger wrote:

On Wed, 2005-12-21 at 16:25 -0200, Celso K. Webber wrote:

Has anyone had this issue before? Or am I missing a step in configuring the software watchdog feature?

Another question for the Red Hat people on the list: does this "software watchdog" work OK? I ask because it's enabled by default when you add a new member to the cluster. The Cluster Suite v3 manual says nothing about this resource either.

Yes, it works fine.

A few things could be happening:

(1) The NMI watchdog will reboot the machine if it detects an NMI hang.
Its timeout is only a few seconds.

(2) The cluster is extremely paranoid because you are not using a
STONITH device (power controller), and it's detecting internal hangs.
Try increasing the failover time.

(3) The cluster is not getting scheduled due to system load.  See the
man page for cludb(8) about clumembd%rtp - both may help.


-- Lon

--

Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
