Cluster Shutdown - ideas?

One thing that cman does rather badly is a full cluster shutdown. With the RHEL4 code you would shut each node down in turn using the init scripts and find that everything hung, because the cluster lost quorum when the N/2th node went down.
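To make the arithmetic concrete, here is a rough sketch of when a rolling shutdown loses quorum. It assumes one vote per node and a simple-majority rule (quorum = expected votes / 2 + 1, rounded down); the function names are made up for illustration and this is not cman code.

```python
def quorum(expected_votes: int) -> int:
    """Simple-majority quorum, assuming one vote per node."""
    return expected_votes // 2 + 1


def first_inquorate_shutdown(total_nodes: int) -> int:
    """Return how many nodes are down at the moment quorum is first lost.

    Models the RHEL4-style rolling shutdown: nodes stop one at a time
    and the expected vote count is never reduced.
    """
    if total_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    needed = quorum(total_nodes)
    for down in range(1, total_nodes + 1):
        if total_nodes - down < needed:
            return down
    raise AssertionError("unreachable: zero nodes never have quorum")


for n in (4, 5, 8):
    print(f"{n} nodes: quorum={quorum(n)}, "
          f"lost after shutting down {first_inquorate_shutdown(n)}")
```

Under these assumptions the hang arrives roughly halfway through the shutdown, which matches what people see in practice.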

With RHEL5 the init script was changed to do a "cman_tool leave remove" which tells the remaining nodes to reduce quorum to allow for the missing node(s).

I don't really like either of these solutions. The RHEL4 way is obviously a nuisance, but even the RHEL5 system is wrong IMHO. A normal node shutdown should not reduce quorum. If other nodes fail while that node is down the cluster runs the risk of a split brain due to reduced quorum.
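As an illustration of the worry, here is a toy model of one way reduced quorum could let two groups both think they are quorate. It assumes one vote per node, simple-majority quorum, that each "leave remove" lowers expected votes by one, and that nodes which restart in isolation fall back to the configured vote count. All of those are simplifying assumptions of mine, not a description of cman internals.

```python
def quorum(expected_votes: int) -> int:
    """Simple-majority quorum, assuming one vote per node."""
    return expected_votes // 2 + 1


def after_leave_remove(total_nodes: int, leavers: int) -> tuple[int, int]:
    """Return (remaining members, reduced expected votes).

    Assumption: every 'leave remove' lowers expected votes by one.
    """
    return total_nodes - leavers, total_nodes - leavers


def both_sides_quorate(total_nodes: int, leavers: int) -> bool:
    """True if survivors and isolated restarted leavers are both quorate."""
    members, expected = after_leave_remove(total_nodes, leavers)
    survivors_ok = members >= quorum(expected)
    # Assumption: restarted nodes use the configured vote count again.
    restarted_ok = leavers >= quorum(total_nodes)
    return survivors_ok and restarted_ok


print(both_sides_quorate(5, 3))  # three of five nodes left with "remove"
print(both_sides_quorate(5, 1))  # only one node left with "remove"
```

In this model the trouble starts once enough nodes have been removed that the leavers alone would form a majority of the configured votes.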

Those of you who have worked with VMS systems know that that OS has a CLUSTER_SHUTDOWN option which causes the cluster software to wait until all nodes have reached a shutdown barrier and then allows all of them to go down at the same time. We could do this with Linux, but I'm not really sure how much use it would be, mainly because the cluster software sits at a higher level in the OS than with VMS and there is a lot more for the computer to do after the cluster software has shut down. It is an option though.
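The barrier idea can be sketched in a few lines, with threads standing in for nodes. This is only an analogy: a real implementation would need a cluster-wide barrier over the network, and the names here are invented for the example.

```python
import threading


def cluster_shutdown(node_names):
    """Simulate nodes waiting at a shutdown barrier, then halting together.

    Returns the ordered list of (event, node) pairs that were recorded.
    """
    barrier = threading.Barrier(len(node_names))
    events = []
    lock = threading.Lock()

    def node(name):
        # Local shutdown work is done; announce readiness.
        with lock:
            events.append(("ready", name))
        # Block until every node has reached this point.
        barrier.wait()
        # Only now is the node allowed to go down.
        with lock:
            events.append(("halt", name))

    threads = [threading.Thread(target=node, args=(n,)) for n in node_names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


for event, name in cluster_shutdown(["node1", "node2", "node3"]):
    print(event, name)
```

No node records "halt" until every node has recorded "ready", so quorum is never lost partway through.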

The other option is simply to set a flag (either in CMAN or locally) to tell the node or the whole cluster that everyone is being shut down. There are a few ways of doing this. The simplest is to add a flag to the cman init script (basically the opposite of what happens now in RHEL5) that causes it to do a "cman_tool leave remove". But that requires the cluster software to be shut down independently of the rest of the software, thus destroying the point of ordered init scripts.

So, the flag could be an environment variable that is checked by the init script (do those get passed through?), or perhaps a flag inside cman itself that changes the "leave" behaviour to do either a "leave remove" or the synchronised cluster shutdown I mentioned earlier.
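For the environment-variable variant, the decision logic might look something like the sketch below. The variable name CLUSTER_SHUTDOWN is purely hypothetical, the code only builds the command rather than running it, and it says nothing about whether the variable would actually survive into the init script's environment.

```python
import os

# Hypothetical variable name, chosen only for this illustration.
SHUTDOWN_FLAG = "CLUSTER_SHUTDOWN"


def leave_command(environ=None):
    """Pick the cman_tool invocation for the init script's stop action.

    Default is a plain leave, which does not reduce quorum. Only when the
    (hypothetical) flag is set to "1" is "remove" added, so quorum shrinks
    as nodes go down during a whole-cluster shutdown.
    """
    env = os.environ if environ is None else environ
    if env.get(SHUTDOWN_FLAG) == "1":
        return ["cman_tool", "leave", "remove"]
    return ["cman_tool", "leave"]


print(" ".join(leave_command({})))
print(" ".join(leave_command({SHUTDOWN_FLAG: "1"})))
```

Keeping the plain leave as the default means a forgotten flag costs you a hung shutdown rather than a cluster running with reduced quorum.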

Does anyone have any preferences, ideas or other options we might consider?

Chrissie

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
