> I understand the above but it's still not clear to me how a
> locking application would get fenced. On startup the application
> could check that the cluster member has joined the fence domain.
> This will ensure that it gets fenced if something goes wrong.
>
> What's not clear is how the fence process will shut down (or
> suspend) the locking application while fencing the node. Fencing
> seems to be related to blocking access to I/O devices.

I'm not entirely sure what you're asking, but I hope a long and broad
answer will cover it.

Say there's a two-node cluster of nodes A and B. Both nodes are running
cman, fence, dlm and some application using the dlm.

1. node A: hangs and is unresponsive
2. node B: cman detects that A has failed
3. node B: all cluster services are stopped/suspended
   (these services are fence and dlm in this example)
4. node B: while dlm service is stopped, it blocks all lock requests
5. node B: cluster still has quorum because of special "two_node" config
6. node B: fence service is started/enabled
7. node B: fence service fences node A
8. node B: dlm service is started/enabled
9. node B: dlm service recovers the application's lockspace and lock
   requests proceed as usual

If the fencing method in step 7 only blocks access to I/O devices from
node A, node A could potentially "revive" and continue running. The dlm
on node B no longer accepts A as a member of the lockspace, so any dlm
messages from A will be ignored by B. Depending on the application, this
may not be sufficient to prevent a revived node A from causing problems.
If so, the simplest thing is to use a fencing method that resets the
power on node A rather than simply blocking its device I/O.

--
Dave Teigland <teigland@xxxxxxxxxx>
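
For readers who find code easier to follow than a numbered list, the sequence on node B can be sketched as a toy Python model. This is only an illustration of the ordering described above: the class, its methods and the "granted"/"blocked"/"ignored" strings are all hypothetical and are not part of the cman, fence or dlm interfaces. The quorum rule for the non-two_node case is assumed to be a simple majority of expected votes.

```python
from dataclasses import dataclass, field


@dataclass
class SurvivingNode:
    """Toy model of node B. Every name here is hypothetical."""
    members: set = field(default_factory=lambda: {"A", "B"})
    expected_votes: int = 2
    two_node: bool = True
    fence_running: bool = True
    dlm_running: bool = True
    fenced: set = field(default_factory=set)

    def has_quorum(self) -> bool:
        # Step 5: with the special two_node setting one node keeps quorum.
        if self.two_node:
            return len(self.members) >= 1
        # Assumed rule otherwise: strict majority of expected votes.
        return 2 * len(self.members) > self.expected_votes

    def request_lock(self, node: str) -> str:
        # Step 4: a stopped dlm blocks every lock request.
        if not self.dlm_running:
            return "blocked"
        # After recovery, messages from a non-member are ignored.
        if node not in self.members:
            return "ignored"
        return "granted"

    def detect_failure(self, node: str) -> None:
        # Steps 2-3: membership changes and all services are stopped.
        self.members.discard(node)
        self.fence_running = False
        self.dlm_running = False

    def recover(self, failed: str) -> bool:
        # Step 5: without quorum nothing is restarted.
        if not self.has_quorum():
            return False
        self.fence_running = True   # step 6: fence service enabled
        self.fenced.add(failed)     # step 7: failed node is fenced
        self.dlm_running = True     # steps 8-9: dlm enabled, locks resume
        return True


def demo() -> list:
    b = SurvivingNode()
    results = [b.request_lock("A")]      # before the failure
    b.detect_failure("A")
    results.append(b.request_lock("B"))  # while dlm is stopped
    b.recover("A")
    results.append(b.request_lock("B"))  # after recovery
    results.append(b.request_lock("A"))  # a "revived" node A
    return results


if __name__ == "__main__":
    print(demo())
```

Note that the model only shows node B ignoring a revived node A at the lock level. It says nothing about what A might do to shared storage on its own, which is why a power-reset fencing method is the safer choice when that matters.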