On Fri, 2010-12-03 at 10:10 +0000, Jankowski, Chris wrote:
> This is exactly what I would like to achieve. I know which node
> should stay alive - the one running my service, and it is trivial
> for me to find this out directly, as I can query for its status
> locally on a node. I do not have to use the network. This can be
> used as a heuristic for the quorum disk.
>
> What I am missing is how to make that into a workable whole.
> Specifically, the following aspects are of concern:
>
> 1.
> I do not want the other node to be ejected from the cluster just
> because it does not run the service. But the test is binary, so it
> looks like it will be ejected.

When a two-node cluster partitions, someone has to die.

> 2.
> Startup time, before the service has started. As no node has the
> service, both will be candidates for ejection.

One node will die and the other will start the service.

> 3.
> Service migration time.
> During service migration from one node to another, there is a
> transient period of time when the service is not active on either
> node.

If you partition during a 'relocation' operation, rgmanager will
evaluate the service and start it after fencing completes.

> 1.
> How do I put all of this together to achieve the overall objective
> of the node with the service surviving the partitioning event
> uninterrupted?

As it turns out, using qdiskd to do this is not the easiest thing in
the world. This has to do with a variety of factors, but the biggest
is that qdiskd has to make its choices -before- CMAN/corosync do, so
it is hard to ensure correct behavior in this particular case.

The simplest thing I know of to do this is to selectively delay
fencing. It's a bit of a hack (though less of one than using qdiskd,
as it turns out).

NOTE: This agent _MUST_ be used in conjunction with a real fencing
agent. Put the reference to the delay agent before the real fencing
agent within the same method; a hypothetical cluster.conf fragment
showing that ordering is appended at the end of this message. The
agent itself might look like this:

#!/bin/bash
#
# Fence-delay hack: if the local node does not own the service,
# wait before real fencing so the service owner wins the fencing
# race after a partition. (PIPESTATUS is a bashism, hence bash.)

me=$(hostname)
service=empty1

# Ask rgmanager who owns the service. 'exit ${PIPESTATUS[0]}'
# propagates clustat's own exit code out of the pipeline, and
# 'tr -d ' '' strips the padding around the owner name so the
# comparison with $(hostname) works.
owner=$(clustat -lfs $service | grep '^ Owner' | cut -f2 -d: \
        | tr -d ' ' ; exit ${PIPESTATUS[0]})
state=$?

echo Eval $service state $state $owner

# Only delay when the query succeeded and someone else owns the
# service. If the service is not running anywhere, nobody waits.
if [ $state -eq 0 ] && [ "$owner" != "$me" ]; then
	echo Not the owner - Delaying 30 seconds
	sleep 30
fi

exit 0

What it does is give preference to the node running the service by
making the non-owner delay a bit before attempting the real fencing
operation. If the real owner is alive, it will fence first. If the
service was not running before the partition, neither node gets
preference.

If the primary driving reason for using qdiskd was to solve this
problem, then you can avoid using qdiskd altogether.

> 2.
> What is the relationship between fencing and node suicide due to
> communication through the quorum disk?

None. Both occur.

> 3.
> How does the master election relate to this?

It doesn't, really. To get a node to drop master, you have to turn
qdiskd's 'reboot' option off. After 'reboot' is off, a node will
abdicate 'master' mode if its score drops. (A quorumd sketch showing
this is also appended below.)

-- Lon

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
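For illustration, a minimal cluster.conf fragment with the ordering
described in the NOTE above might look like the sketch below. Treat
it as a sketch only: the device names, the "fence_delay" agent (the
script above installed as a fence agent), and fence_ipmilan standing
in for the real agent are all placeholders, not a tested
configuration. The second node's method would list the same delay
device first.

<clusternodes>
  <clusternode name="node1" nodeid="1">
    <fence>
      <!-- The delay script runs first within the method; since it
           always exits 0, fenced then proceeds to the real agent. -->
      <method name="1">
        <device name="delay"/>
        <device name="ipmi-node1"/>
      </method>
    </fence>
  </clusternode>
  <!-- node2 would be configured the same way. -->
</clusternodes>
<fencedevices>
  <!-- "fence_delay" is the script above, installed where fenced can
       find it; the name is a placeholder. -->
  <fencedevice name="delay" agent="fence_delay"/>
  <fencedevice name="ipmi-node1" agent="fence_ipmilan"
               ipaddr="..." login="..." passwd="..."/>
</fencedevices>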
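And for the 'reboot' option mentioned in the last answer, a
hypothetical quorumd stanza might look like this; the label, timing
values, and heuristic program are placeholders, assuming the
'reboot' attribute documented in qdisk(5):

<quorumd interval="1" tko="10" votes="1" label="myqdisk" reboot="0">
  <!-- With reboot="0", a node whose heuristic score drops gives up
       'master' instead of rebooting itself. -->
  <heuristic program="/usr/local/sbin/service_owner_check"
             score="1" interval="2"/>
</quorumd>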