On Wed, 2004-10-06 at 11:27, Daniel Phillips wrote:
> On Tuesday 05 October 2004 18:47, Daniel McNeil wrote:
> > > The idea is, there is a service manager out there somewhere that
> > > keeps track of how many instances of a service of a given type
> > > currently exist, and has some way of creating new resource
> > > instances if needed, or killing off extra ones. In our case, we
> > > want exactly one instance of a csnap server. We need not only to
> > > specify that constraint somehow and get a connection to it, but we
> > > need to supply a method of starting a csnap server. So csnap-agent
> > > will be a client of service manager and an agent of resource
> > > manager.
> >
> > Why do you need a service manager for this? As Lon suggested,
> > a DLM lock can provide the 1 master and the others ready
> > to take over when the lock is released.
>
> The DLM uses the service manager. Why lather on another layer, when
> really we just want to use the service manager too?
>

The DLM is a well known interface that has had many implementations.
When Patrick sent out the Generic Kernel API it included membership
and quorum interfaces, which are also things that have/could have
many implementations. The service manager is something new that
I have not seen in other cluster implementations. Are you planning
on doing a generic API for service manager as well?

From my previous experience with other cluster implementations,
the DLM was only dependent on membership and quorum (and cluster-wide
communication). From my perspective the service manager is the
other layer. :)

If you make csnap depend on the service manager, then any other
cluster implementation that wanted to use csnap would have to
provide the service manager functionality.

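To make the "one master, the others ready to take over" idea concrete,
here is a minimal single-process sketch in Python. It is only an analogy:
ExclusiveLock, elect() and the node names are made up for illustration,
and threading.Lock stands in for a cluster-wide exclusive lock. None of
this is the libdlm API.

```python
import threading


class ExclusiveLock:
    """Hypothetical stand-in for a cluster-wide exclusive lock (not libdlm)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.holder = None  # name of the node currently acting as master

    def try_acquire(self, node):
        # Non-blocking attempt: only one caller can succeed while the lock is held.
        if self._lock.acquire(blocking=False):
            self.holder = node
            return True
        return False

    def release(self, node):
        # Only the current holder may release (e.g. on clean exit or failure).
        if self.holder != node:
            raise RuntimeError(f"{node} does not hold the lock")
        self.holder = None
        self._lock.release()


def elect(lock, nodes):
    """Every node tries for the lock; return the nodes that got it."""
    return [node for node in nodes if lock.try_acquire(node)]


if __name__ == "__main__":
    lock = ExclusiveLock()
    print("masters:", elect(lock, ["node1", "node2", "node3"]))
    lock.release(lock.holder)  # the master goes away
    print("masters after release:", elect(lock, ["node2", "node3"]))
```

In a real cluster the standby nodes would presumably leave a blocking
request queued on the lock and be granted it when the master releases it
or dies, rather than retrying as this sketch does.
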
> > > We won't talk to either service manager or resource manager
> > > directly, but go through Lon's Magma library, which is supposed to
> > > provide a nice stable api for us to work with, regardless of
> > > whether particular services reside in kernel or user space, or are
> > > local or remote. Lon has said that he will adapt the Magma api as
> > > we go, if we break anything or run into limitations. (I suppose
> > > that is why it is called Magma, it flows under pressure.)
> >
> > Why do we want to use Magma? At the cluster summit I thought
> > that Magma was just the way to provide backward compatibility
> > for the older GFS releases. Did we agree to make magma the
> > API? Having csnap depend on the DLM API makes more sense to me.
>
> Have you looked at the dlm api? Why would we want to be directly
> ioctling sockets when we could be using a library interface? I'm not
> necessarily disagreeing with you, the question is: should we be using a
> library for this or not? I'd think that using a library is motherhood,
> though it does force us to think about the api a little harder.

I've looked at libdlm.h and libdlm.so. It looks like it is the
library that provides dlm_lock(), dlm_unlock() and friends. I have
not reviewed all the dlm calls, but it looks about right. What am I
missing? I didn't see any direct ioctls.

>
> > > Magma doesn't actually know anything about what we're asking it, it
> > > only knows how to pass on requests to somebody who does. So we're
> > > actually talking to service manager and resource manager through
> > > Magma, and presumably they talk to each other as well, because
> > > service manager must ask resource manager to create or kill off
> > > resource instances on its behalf.
> >
> > What would need to be killed off? Under what circumstances?
>
> If the cluster shrinks, the resource manager might decide that the
> population of a particular sort of server is too high and some should
> be culled.
> Of course, having too many servers is less of a problem
> than having too few, but I generally dislike "grow only" systems of any
> ilk.
>

I agree with this usage for resource managers in general, but this
does not seem to apply to the csnap server.

> > > Anyway, csnap-agent is mainly going to be talking to service
> > > manager through Magma, but it also needs to tell resource manager
> > > about our resource, its constraints and how to set itself up as an
> > > agent to create it. I don't have a clear picture of how this works
> > > at the moment, and that is the point of this email.
> > >
> > > For example, how do we specify the service manager constraints,
> > > i.e., "exactly one" in this case: before we request the instance,
> > > or as part of the request, or in a configuration file somewhere?
> >
> > The csnap-agent to csnap-server seems like a perfect example of why we
> > need a cluster communication API. The csnap-agent wants to send
> > information to the csnap-server and could use a highly available
> > communication mechanism.
>
> A csnap agent never sends information to a csnap server, except to start
> one locally at the request of a resource agent.
>
> There may be a good use for a virtual synchrony-based cluster communication
> api somewhere in this, but that's not it.

I just started reading through your cluster.snapshot.design.html.
I was talking about the csnap client to csnap server communication.
I did a quick search through the design doc and don't see what the
csnap-agent is for. I'll keep reading.

Daniel