On Tue, 2006-09-05 at 07:43 -0400, danwest wrote:
> What happens if the servers you are using require ACPI=on in order to
> boot. For instance IBM X366 servers need ACPI set in order to boot.
> With ACPI=on both nodes reboot when a fence occurs(see "both nodes off
> problem" in thread below). This is not desirable, especially with
> active/active clusters.

Hopefully, the X366 either turns off immediately or can be configured to
do so upon getting the "power off" command with ACPI enabled. If it does
not, then you will need remote power control or fabric-level fencing.

Here is some relevant background information.

If you look at the IPMI v1.5 and v2 specifications, instruction 0 for
power control is supposed to force the system into the S4/S5 (soft-off)
state immediately (for use in emergency situations). If you then look at
the ipmitool source code, you will find that it uses instruction 0 when
you run a 'chassis power off' command.

(quote, source =
http://www.intel.com/design/servers/ipmi/pdf/IPMIv2_0_rev1_0_E3_markup.pdf
- page 403):

  [3:0] - chassis control
  0h = power down. Force system into soft off (S4/S45) state. This is
  for `emergency' management power down actions. The command does not
  initiate a clean shut-down of the operating system prior to powering
  down the system.

(/quote)

The reason linux-cluster often needs ACPI disabled with IPMI is that, in
many cases, machines which receive this "emergency power off"
instruction do not appear to behave as the IPMI specification states.
That is, some do a full, complete, clean shutdown when ACPI is enabled.
If the shutdown never completes, fencing will never complete and the
cluster will never recover.

Now, not all machines behave this way. If your machine powers off
immediately with ACPI enabled, then you do not need to disable ACPI.
(Note: cheating by switching the acpid event for power button presses to
/sbin/poweroff -fn does *not* count!)

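One way to find out which camp a given machine falls into is to time how long it takes to report "off" after the power-off command. What follows is only a rough sketch, not anything the fence agents actually do. The function names, the 30-second timeout, and the use of the `lan` interface are all assumptions for illustration, and it assumes ipmitool answers a status query with a line ending in "on" or "off" (for example "Chassis Power is off"). Run it only against a node you can afford to lose.

```python
import subprocess
import time


def ipmi_chassis_power(host, user, password, action, run=subprocess.run):
    """Run `ipmitool chassis power <action>` against a remote BMC.

    `action` is e.g. "off" or "status". Returns ipmitool's stdout, stripped.
    The `-I lan` interface is an assumption; some BMCs need a different one.
    """
    cmd = ["ipmitool", "-I", "lan", "-H", host, "-U", user, "-P", password,
           "chassis", "power", action]
    result = run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def seconds_until_off(power, timeout=30.0, interval=1.0,
                      clock=time.monotonic, sleep=time.sleep):
    """Send "off", then poll "status" until the chassis reports off.

    `power` is any callable taking an action string and returning text.
    Returns the elapsed seconds, or None if the machine is still on
    when the (hypothetical) timeout expires.
    """
    start = clock()
    power("off")
    while clock() - start <= timeout:
        if power("status").lower().endswith("off"):
            return clock() - start
        sleep(interval)
    return None
```

With a hypothetical host and credentials, `seconds_until_off(lambda a: ipmi_chassis_power("node1-bmc", "admin", "secret", a))` returns a small number on a machine that honors the immediate power-off, and a large number or None on one that attempts a clean ACPI shutdown first. The latter is the case where fencing can stall.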
It is possible that some machines are - quite simply - twiddling the
motherboard's soft power button. In that case, it is possible that those
machines can also be configured in the BIOS to do an immediate-off when
the power button is pressed, thereby removing the need to boot with ACPI
disabled.

There may be other ways to work around the ACPI/IPMI problem on your
specific hardware; this is just an example. Booting with ACPI disabled
is the general "quick fix", which works immediately for the majority of
machines with IPMI and does not require hardware-specific configuration.
Booting with ACPI disabled also works for other types of integrated
power management (iLO, RSA, DRAC, etc.), which often suffer from the same
problem.

As noted by others in separate emails to this list, it would be nice if
we could use the reboot operation more often, rather than "off, on"
cycles in all cases. Most fencing solutions cannot (as far as I know)
confirm that a machine has rebooted the way they can confirm that a
machine is "off" or "on". Of course, "reboot" does not suffer from the
theoretical "everyone off at once" problem, and it should eliminate the
need to boot with ACPI disabled.

-- Lon