On Tue, Oct 05, 2004 at 02:39:44PM -0700, Steven Dake wrote:

> if it's possible people will do it..

I think that's the point -- you can if you want.

> This leads to the problem that then the APIs cannot be trusted to
> deliver certain guarantees such as agreed ordering of virtual synchrony.
> If the APIs cannot be trusted to deliver, for example, agreed ordering,
> then nobody will use agreed ordering and we will have a mess of
> two-phase commit protocols on our hands.. Or worse, systems will not
> operate correctly in a distributed fashion.

Here's the point of this whole exercise: to allow /multiple/ varieties of
cluster manager to live behind one API. Some cm's, like yours, would take
the virtual synchrony approach with everything that implies. Other cm's
may be intended for something else, not require the guarantees yours
provides, and take a different course.

I thought the whole motivation was to allow for many cm implementations,
each with their own unique characteristics, but all exporting their
function through a common kernel API. (Although I wasn't there, I heard
"everyone else" agreed this was necessary.)

If there's only one kernel cm (one that uses VS as you suggest), then
there's no point in pursuing this "common API" idea -- there would just be
"the API" exported by "the cm".

This explains why we seem completely out of sync in this discussion. You
obviously don't see any need for different cm implementations (and think
it's a bad idea.) In theory you may be right, but I think this is mainly
a practical question right now. Someone else could probably produce some
technical reasons why your CM isn't what they want -- clustering is a
pretty broad field and saying there's "only one right way" is a bit bold.
(In theory we only need one local file system after all.) In fact, I've
thought there may be enough variation among cm's that sharing an API would
be impossible -- I'm still not sure.

With cm's providing different functions and behavior, an "application"
would obviously need to select a specific cm by name (each implementation
has a unique name) to attach to and use.

> virtual synchrony requires membership and messaging to be integrated to
> deliver on its model. I can't think of more exotic membership systems
> except perhaps intergroup.

Here's an example: the lowest level cm provides basic
membership/messaging. It considers any node a member as long as it can
communicate with it. Now say there's a higher level membership system
built above this that has a more restrictive policy on who can be a
member. It takes the membership info from the lower level cm, removes the
members that don't meet its criteria and exports that new list as the
members. An application would have to be written, of course, to interface
with one of the two cm's depending on what it needs.

[This concept of layering additional features is one aspect of the common
API idea that I'm not emphasizing as much as the basic concept of
alternative implementations.]

> > I think the question here is whether your messaging/membership system
> > (currently in user space) would fit behind the API Patrick sent once
> > ported to the kernel. If not, then what needs to be changed so it would?
> > The idea is for the API to be general enough to support a variety of
> > clustering modules, including yours.
>
> Virtual synchrony is the "one true model" for distributed computing.

That may be, and if it's true I don't imagine any other cm's will exist
behind this API in the long term. The question was, will this API
adequately export whatever your cm provides?

> Other systems just don't deliver the features that are available in
> virtual synchrony.

They may not deliver those features by choice simply because the features
aren't necessary for what they're designed to do.
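
As a rough sketch of the select-by-name and layering ideas above: it is
written in Python rather than kernel C, purely for illustration, and every
class, function and node name in it is made up. None of it is taken from
the API Patrick sent.

```python
from typing import Callable, Dict, Iterable, List, Protocol


class ClusterManager(Protocol):
    """Hypothetical common interface that every cm exports."""
    name: str

    def members(self) -> List[str]: ...


class BasicCM:
    """Lowest-level cm: a node is a member as long as it is reachable."""
    name = "basic"

    def __init__(self, reachable: Iterable[str]) -> None:
        self._reachable = list(reachable)

    def members(self) -> List[str]:
        return list(self._reachable)


class RestrictiveCM:
    """Higher-level cm: takes the lower cm's list, drops the nodes that
    fail its own criteria, and exports the result as its members."""
    name = "restrictive"

    def __init__(self, lower: ClusterManager,
                 admit: Callable[[str], bool]) -> None:
        self._lower = lower
        self._admit = admit

    def members(self) -> List[str]:
        return [n for n in self._lower.members() if self._admit(n)]


# Each implementation registers under its unique name.
REGISTRY: Dict[str, ClusterManager] = {}


def register(cm: ClusterManager) -> None:
    REGISTRY[cm.name] = cm


def attach(name: str) -> ClusterManager:
    """An application picks one specific cm by name (KeyError if absent)."""
    return REGISTRY[name]


basic = BasicCM(["node1", "node2", "node3"])
register(basic)
# Assumed policy for the example only: node3 does not meet the criteria.
register(RestrictiveCM(basic, admit=lambda n: n != "node3"))

print(attach("basic").members())
print(attach("restrictive").members())
```

Both cm's sit behind the same interface, but an application still has to
choose which one it attaches to, since the two export different member
lists.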

> This allows us the freedom to design any sort of distributed system if
> we accept virtual synchrony must exist at the lowest level.

We're aiming for even more freedom -- the ability to reject even a
VS-based cm. If that's a foolish idea, then alternatives will either die
or never sprout up. From what I've heard, there's not a consensus on one
true cm everyone will adopt. I think it's unlikely to happen any time
soon which means we need to allow for different approaches.

> If virtual synchrony is not enforced by the API, then people that don't
> care about virtual synchrony immediately could provide implementations
> that don't support those features. This would result in fragmented
> implementations of clustering infrastructure which is what we are trying
> to avoid.

As I said earlier, I thought the whole point of this was to allow for
fragmentation but to agree on an API if possible. If that's the case,
then the API should probably be as permissive as possible.

> Not only that, these solutions would not be reliable in partitions,
> merges, or faults because they would most likely not handle these
> situations in a deterministic and correct fashion. IMHO, it is
> impossible to make a reliable distributed system if partitions, merges,
> and faults are not addressed up front as part of the APIs and protocols.

Some people may not be as interested in reliability as you and I are. I
understand what you're saying and I think we'd like to use pretty much the
same kernel cm in the end. The common kernel API isn't really about what
/we/ want, though, it's about what other people might want to do.

If it's possible to share a common API despite a diversity of
implementations that would be nice -- at least that's the basis of this
discussion. Your goal (to get everyone to agree on a single kernel cm)
might be possible, but it'll probably take a bit of work. Getting
everyone to share a common API would at least be a step in that direction.

-- Dave Teigland <teigland@xxxxxxxxxx>