Re: node fails to stop when inquorate

On Wed, 2006-10-18 at 21:38 +0200, Katriel Traum wrote:

> The (ugly) workaround I've been using is killing the process manually
> and then manually removing /var/lock/subsys/rgmanager, which causes "rc"
> to skip it.

> Is there a better way to restart a failed node? Shouldn't a failed node
> be "hard booted" by cman?

Nodes don't "know" they're fenced with fabric-level fencing; it's a
deficiency in the model itself.

The easiest thing to do is 'reboot -fn'.  A fenced node may have
outstanding buffers which never get cleaned up, so you can't "un-fence"
it until it has been rebooted anyway.

Rgmanager's child processes are probably trying to umount a file
system that has been fenced and are stuck in disk-wait, which may be
"forever", depending on the storage configuration.
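
One way to check whether that is what is happening is to look for
processes in uninterruptible sleep (state "D" in ps or in
/proc/<pid>/stat).  Below is a minimal Python sketch, assuming a Linux
/proc layout.  The function names are only illustrative and are not part
of rgmanager or cman.

```python
import os


def parse_stat_state(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' rather than on whitespace.
    left, _, right = stat_line.rpartition(")")
    pid_str, _, comm = left.partition("(")
    state = right.split()[0]
    return int(pid_str), comm, state


def find_disk_wait(proc_root="/proc"):
    """List (pid, comm) for every process currently in state 'D'."""
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_state(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or entry unreadable; skip it
        if state == "D":
            stuck.append((pid, comm))
    return stuck


if __name__ == "__main__":
    for pid, comm in find_disk_wait():
        print(pid, comm)
```

If umount (or other rgmanager children) show up in that list and stay
there, they are blocked in the kernel waiting on I/O and cannot be
killed, which is consistent with the stop script hanging.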

There's a patch outstanding for qdiskd which makes the node reboot on
loss of score.  However, I don't think this is your problem.

-- Lon

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
