> > Is there some method for the lock to be revoked,
>
> Killing the agent that has it should do the job, which would be part of
> STONITH. There also has to be a way of giving up the lock gracefully
> when a node exits the cluster voluntarily. I neglected to mention
> "graceful node exit and cleanup" as another bit of infrastructure glue
> still needed.
>

Um... I just realized that there's a problem here. If the agent dies but
the server doesn't, the lock will get revoked. While this won't interfere
with the clients currently connected to the server, any new client (or any
client that gets disconnected) will think that there is no server, and
promote its own server to master... and data corruption will follow.

As far as I can tell, the way to ensure that this doesn't happen is to have
the server process take out the lock. That way the lock won't be freed
unless the server process dies. Agreed?

If that's the case, should the server also be responsible for contacting
the agents in the appropriate service group and getting the client
information?

-Ben
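
To make the failure mode concrete, here is a minimal sketch in Python. It is
a toy in-memory model, not the real lock manager: the names Process,
MasterLock and count_masters are made up for illustration, and the only rule
assumed is the one discussed above, that the lock is revoked when its holder
dies. It also simplifies by letting the promoted server take the lock
directly in both cases.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Process:
    """Hypothetical cluster process (an agent or a server) that can die."""
    name: str
    alive: bool = True


@dataclass
class MasterLock:
    """Toy cluster-wide lock that is revoked when its holder dies."""
    holder: Optional[Process] = None

    def acquire(self, proc: Process) -> bool:
        # Assumed rule: a dead holder loses the lock before anyone else tries.
        if self.holder is not None and not self.holder.alive:
            self.holder = None
        if self.holder is None:
            self.holder = proc
            return True
        return False


def count_masters(lock_owner_is_server: bool) -> int:
    """Kill the agent on node A, then let a new client on node B look for a
    master. Returns how many servers end up acting as master."""
    agent_a = Process("agent-a")
    server_a = Process("server-a")
    server_b = Process("server-b")

    lock = MasterLock()
    lock.acquire(server_a if lock_owner_is_server else agent_a)

    # The agent dies, but server_a keeps running as master.
    agent_a.alive = False
    masters = 1 if server_a.alive else 0

    # The new client promotes its local server only if the lock looks free.
    if lock.acquire(server_b):
        masters += 1
    return masters


if __name__ == "__main__":
    print("agent holds lock: ", count_masters(lock_owner_is_server=False))
    print("server holds lock:", count_masters(lock_owner_is_server=True))
```

With the agent as lock holder the model ends up with two masters; with the
server as lock holder it stays at one, because the lock is only freed when
the server process itself dies.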