node fails to stop when inquorate

Katriel Traum <katriel@xxxxxxxxxxxxxxxx> · Wed, 18 Oct 2006 21:38:13 +0200

Hello.

I've been seeing some strange behavior on a failed node that perhaps
some of the forum members have encountered.

The setup is a 2-node cluster with qdiskd running. Disconnecting one
node from the network causes it to be "fabric fenced", and the remaining
node continues working as expected.
When I try to restart the failed node, rgmanager's script sends the
rgmanager process into a zombie state, which makes the script loop forever.
The (ugly) workaround I've been using is to kill the process manually
and then remove /var/lock/subsys/rgmanager by hand, which causes "rc"
to skip it.
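
For anyone who wants to reproduce the workaround, here is a rough sketch
of the two manual steps. It is only an illustration: the function name
clear_stuck_service is made up, the PID has to be looked up by hand, and
it normally needs to run as root. Only the lock file path comes from the
description above.

```python
import os
import signal
from pathlib import Path


def clear_stuck_service(pid, lock_file):
    """Kill a stuck process and remove its subsys lock file.

    Hypothetical helper, not part of rgmanager or cman.
    Returns a (killed, lock_removed) tuple of booleans.
    """
    killed = False
    try:
        # SIGKILL cannot be caught or ignored by the target process.
        os.kill(pid, signal.SIGKILL)
        killed = True
    except ProcessLookupError:
        # The process is already gone; nothing to kill.
        pass

    lock_removed = False
    try:
        # With the lock file gone, "rc" no longer sees the service
        # as started and skips it.
        Path(lock_file).unlink()
        lock_removed = True
    except FileNotFoundError:
        # The lock file was already removed.
        pass

    return killed, lock_removed
```

It would be called as clear_stuck_service(pid, "/var/lock/subsys/rgmanager"),
with pid being whatever "ps" reports for the stuck rgmanager process. This
only automates the manual steps; it does not address why the process gets
stuck in the first place.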

Is there a better way to restart a failed node? Shouldn't a failed node
be "hard booted" by cman?

Thanks,
-- 
Katriel Traum, PenguinIT
RHCE, CLP
Mobile: 054-6789953

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster