On Friday 08 October 2004 18:30, Daniel McNeil wrote:
> The csnap server is the best one to know what is best.

It's not running, so how can it know? You probably meant the csnap agent
(running in user space on every cluster node). Then I'd say, how can it
know by looking only at itself? (Similar to looking in the mirror at
yourself and thinking "I'm da man!" In the real world, it's normally
other people who decide you're the man, and give you responsibilities.)

> See my previous posting on using 2 dlm locks to allow
> different priorities. A directly connected node will
> be selected if there is one, otherwise one of the "other"
> nodes will be selected. The csnap server can just do
> the right thing.

Sorry for not responding to that. Speed of connection is only one
metric. What about the speed and memory capacity of the node itself?
What about the current workload of the node? And what about comparing
these things to other nodes?

> > > (3) Don't use the cluster-lock model. It has its shortcomings.
> > > Its strengths are in its simplicity; not its flexibility.
>
> Actually, the DLM can be used in simple ways or very complex
> ways. It is very flexible. It does have a different programming
> model that takes time to get used to.

As soon as you start using it in a complex way, you probably should have
spent your time building the thing you're trying to approximate with the
dlm. You'll end up with what you really want, with the same amount of
effort, or less because you won't spend a lot of time trying to make it
be something it isn't.

If you find a need for global locking in your design, go ahead and use
the dlm for it, but don't try to turn the dlm into a resource manager;
it simply isn't one. Why is it that the hammer/nail effect gets so
strong in the vicinity of a dlm?

> > Yes, that's the one. We need real resource management, even if it
> > initially just consists of an administrator setting up config
> > files. Something has to read those config files[1] and respond to
> > server instance requests from csnap agents accordingly.
> >
> > [1] At cluster bring-up time. The resource manager has to be able
> > to operate without reading files during failover.
>
> IMHO, a resource manager is NOT the right way to do this:
>
> - cluster services should avoid config files if at all possible.
>   If they are not set up right, the whole cluster can get
>   messed up. If the cluster changes, the config files might
>   need to change. The config files will be your single point
>   of failure. From previous experience, cluster configuration
>   is one of the biggest sources of cluster failure, and you
>   won't know it until a failure -- the worst possible time.

Then the configuration file should let you configure only what needs to
be configured. Anyway, what do config files have to do with needing or
not needing a resource manager?

> - It makes a low-level function dependent on a higher-level
>   function. As you say above, the resource manager has to
>   operate very carefully to avoid deadlocking. This is
>   asking for trouble.

Then what we want is a good, low-level resource manager. Let people
interface to it and script themselves into oblivion, err, nirvana, but
make the low-level thing very simple and easy to audit.

Regards,

Daniel