Thanks, I'll look into it.
I don't think qdiskd rebooting is a good solution for this scenario. Are
there any cases in which cman reboots a machine? Maybe this should be
configurable (not only when qdiskd tells it to reboot).

Thanks,
+Katriel

Lon Hohberger wrote:
> On Wed, 2006-10-18 at 21:38 +0200, Katriel Traum wrote:
>
>> The (ugly) workaround I've been using is killing the process manually
>> and then manually removing /var/lock/subsys/rgmanager, which causes "rc"
>> to skip it.
>
>> Is there a better way to restart a failed node? Shouldn't a failed node
>> be "hard booted" by cman?
>
> Nodes don't "know" they're fenced with fabric-level fencing; it's a
> deficiency in the model itself.
>
> The easiest thing to do is 'reboot -fn'. A fenced node may have
> outstanding buffers which never get cleaned up - so you can't "un-fence"
> them until they have been rebooted anyway.
>
> Rgmanager's child processes are probably trying to umount a file
> system that has been fenced and are stuck in disk-wait - which may be
> "forever", depending on the storage configuration.
>
> There's a patch outstanding for qdiskd which makes it reboot on loss of
> score. However, I don't think this is your problem.
>
> -- Lon
>
> --
> Linux-cluster mailing list
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster

-- 
Katriel Traum, PenguinIT
RHCE, CLP
Mobile: 054-6789953