On Thu, 2006-09-07 at 00:59 -0300, Celso K. Webber wrote:
> Hello friends,
>
> Regarding Red Hat Cluster Suite and/or GFS, could someone from Red Hat
> please tell me if the use of IPMI embedded devices from the servers'
> motherboards is officially certified by Red Hat?
>
> I'd like to have this information so that we can recommend (or not) to
> customers the use of IPMI as a secure form of fencing.
>
> We had some bad experiences recently on some servers where only one of
> the onboard NICs listened to the IPMI over LAN packets, so it appeared
> to us that sometimes IPMI is not that safe as a fence device. Of course
> the Cluster software will assume nothing when the fencing fails, but the
> bad thing is that there is no automatic failover on this situation.

It's supported, but there are a couple of caveats that you should be
aware of:

(a) You should, if possible, use the IPMI-enabled NIC only for IPMI
traffic. At the very least, you should not use it for cluster
communication traffic, though it is fine for service-related traffic
(e.g. rgmanager) and other traffic. That way, the IPMI-enabled port
can't become a single point of failure (SPF).

Here's why: if IPMI and cluster traffic use the same NIC, then that NIC
failing (or becoming disconnected) will cause the node to be evicted,
but will also prevent fencing, because the IPMI host will be
unreachable.

Similarly, on a machine with a single power supply and IPMI fencing in
a cluster, the power cord becomes an SPF: if you pull the power, the
host is dead and fencing cannot complete (because IPMI does not have
power either!), which leads to...

(b) If you do not have *both* dual power supplies and dual NICs, you
need something else (in addition to IPMI) if no single point of failure
(NSPF) is a requirement for your particular installation.

For example, one linux-cluster user added their Fibre Channel switch as
a secondary fence device (in its own fence level). Their cluster tries
to fence using IPMI first.
Failing that, the cluster falls back to fencing via the Fibre Channel
switch.

(c) You often need to disable ACPI on hardware which has IPMI if you
intend to use IPMI for fencing. This can vary on a per-machine basis,
so you should check first. If a host does a "graceful shutdown" when
you fence it via IPMI, you need to disable ACPI on that host (e.g. boot
with acpi=off). When fenced, the server should turn off immediately (or
within 4-5 seconds, as when you hold in an ATX power button to force a
machine off).

Hope that helps!

-- Lon

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
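
The fallback described in (b) can be sketched roughly as below. This is
only an illustration of the ordering logic, not the cluster software's
actual code: the names fence_node, ipmi_unreachable and
fc_switch_port_disable are hypothetical, and the rule that every device
in a level must succeed is an assumption made for the sketch.

```python
from typing import Callable, Sequence

# A fence device is modelled as a callable that returns True when the
# target node was confirmed powered off or cut off from shared storage.
FenceDevice = Callable[[str], bool]


def fence_node(node: str, levels: Sequence[Sequence[FenceDevice]]) -> int:
    """Try each fence level in order.

    Returns the index of the first level that succeeded, or -1 if every
    level failed. Assumption for this sketch: a level succeeds only if
    it is non-empty and every device in it succeeds.
    """
    for index, devices in enumerate(levels):
        if devices and all(device(node) for device in devices):
            return index
    return -1


# Hypothetical stand-ins for real fence agents.
def ipmi_unreachable(node: str) -> bool:
    # Simulates a dead NIC or a pulled power cord: IPMI cannot be reached.
    return False


def fc_switch_port_disable(node: str) -> bool:
    # Simulates disabling the node's port on the Fibre Channel switch.
    return True


# Level 0 is IPMI, level 1 is the Fibre Channel switch.
levels = [[ipmi_unreachable], [fc_switch_port_disable]]
result = fence_node("node1", levels)
print(result)
```

With IPMI unreachable, the sketch falls through to the second level. If
IPMI were the only level configured, fence_node would return -1, which
corresponds to the case where fencing cannot complete and there is no
automatic failover.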