Lon Hohberger wrote:
> On Wed, 2005-12-07 at 10:08 -0500, Greg Forte wrote:
>
>> <device name="FENCE1" option="reboot" port="1"/>
>> <device name="FENCE2" option="reboot" port="1"/>
>>
>> and increased the reboot wait time on the PDUs to make sure it'd wait
>> long enough, and that SEEMS to work (once I remembered to turn off ccsd
>> before updating my cluster.conf by hand so that it didn't end up
>> replacing it with the old one immediately ;-)
>
> I don't know how I missed this, but this is a poor idea.
>
> What if fenced hangs in the middle? Then you haven't turned off the
> power at all, but the cluster thinks you did! Goodbye, file systems!
>
> There's no way to guarantee that both ports were turned off
> simultaneously, irrespective of the timeout values. :(
>
> You could do:
>
> <device name="FENCE1" option="off" port="1"/>
> <device name="FENCE2" option="reboot" port="1"/>
> <device name="FENCE1" option="on" port="1"/>
>
> ...but that's about as "optimal" as you can get while still being safe.

Thinking about this a bit further, how is the second example any better
than the first? If fenced hangs after issuing the "off" to FENCE1 in
your conf, but before or during issuing the reboot to FENCE2, how is
that different from it hanging between issuing the two reboots in mine?
Aside from the fact that mine (in theory) leaves both power outlets on,
whereas yours leaves one off, isn't the net effect the same: the node
didn't get fenced, but the cluster thinks it did? The same argument
applies to the "off","off","on","on" configuration that I'd just as
soon use.

I guess the real ambiguity here is in this concept of "thinks" -
wouldn't cman expect to get X "OK" responses from fenced, where X is the
number of entries in the <fence> section, and if it didn't receive X
responses then assume something was amiss? Otherwise it seems like
fencing with redundant fence devices is inherently unsafe ...
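For clarity, the "off","off","on","on" ordering I mean would look
something like this in cluster.conf - a sketch only, reusing the FENCE1
and FENCE2 device names and ports from the snippets above (the method
name "dual-pdu" is just an illustrative label):

```xml
<fence>
        <method name="dual-pdu">
                <!-- cut power on both PDUs first, so the node is
                     guaranteed dead before anything is switched back on -->
                <device name="FENCE1" option="off" port="1"/>
                <device name="FENCE2" option="off" port="1"/>
                <!-- then restore both outlets -->
                <device name="FENCE1" option="on" port="1"/>
                <device name="FENCE2" option="on" port="1"/>
        </method>
</fence>
```

The point being that the node is only ever powered back on after both
"off" operations have reported success - though as argued above, a hang
partway through still leaves the same ambiguity.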
On a slightly related note, system-config-cluster strikes again - I
started it to monitor the cluster services, and it appears to have
clobbered my "illegal" fence sections that it didn't like. How would one
go about controlling (restarting, disabling) cluster services without
the gui? I know cman_tool allows you to check the status, but it doesn't
seem to have any options for service control.

Thanks.

-g

--
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
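For what it's worth, the non-GUI route I've found so far is rgmanager's
clusvcadm (with clustat for status) - the service name "myservice" below
is just a placeholder, and I'm going from the man pages here, so treat
this as a sketch rather than gospel:

```shell
# show cluster and service status (rgmanager's counterpart to cman_tool)
clustat

# disable (stop) a service
clusvcadm -d myservice

# enable (start) a service
clusvcadm -e myservice

# restart a service in place
clusvcadm -R myservice
```

These only cover rgmanager-managed services, not the underlying cluster
daemons themselves.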