On Wed, Oct 06, 2004 at 08:55:27PM -0400, Daniel Phillips wrote:
> It seems to me that the notions of service and resource manager are
> entirely appropriate for the snapshot server, what I'm trying to sort
> out is how they interact.

The reason there is so much confusion is that your notions of service
and resource manager are largely incorrect.  Here's a long assortment
of comments on the topic to hopefully help correct them.  I've said or
written it all before, so I'm not sure how much help repeating myself
will be.

- AFAIK, what I call Service Manager is the first ever of its kind.
  The name is unfortunate because it leads people to believe it's
  something very different from what it really is.  If you can find
  something else out there that does what SM does, please point it out
  to me.

- Most of the time, if you think you should use SM, you're probably
  wrong and don't really know what it's for.

- SM is nothing like a Resource Manager; they are completely different
  things and do not interact with each other.  If you want them to
  interact, you are probably still very confused.

- SM is almost exclusively designed for symmetric clustering
  subsystems and almost exclusively for in-kernel use.  If you're
  making an exception to one of these, you should probably rethink
  what you're doing and study even more carefully the systems that SM
  was designed for and exactly how they use it.

- SM was designed for only three symmetric clustering subsystems: the
  fence domain manager (then in the kernel), the dlm and gfs.  To this
  day I've yet to see another specific use of SM that really fits
  well.  SM was designed for this very specific use, not for general
  use.

- SM is essentially a factoring-out of code from the fence domain
  manager, dlm and gfs that would otherwise exist in each of them as a
  "glue layer" between itself and the cluster manager.  I just saw
  ahead of time what was needed and skipped the phase where each of
  them had its own internal variation of SM.  This sort of
  service-specific embedding of the SM function is what people have
  done in the past in all sorts of ways.  AFAIK, no one has had so
  many starkly symmetric systems at once, though, which means they
  never had quite this much replication.

- Many people who think they want to use SM really only need the info
  generally available from the cluster manager itself.

- A Resource Manager does not need to use the SM.  An RM is
  fundamentally about starting and monitoring system services or
  applications.  These services do /not/ include the symmetric
  "services" related to SM (fence domain manager, dlm and gfs).  If
  you think it might make sense for RM to manage, say, gfs, you are
  still seriously confused.

- SM is all about systems that are actively running, symmetrically, on
  a specific subset of the nodes in a cluster.  The explicitly
  symmetric nature of the algorithms used by these services makes
  managing them tricky.  An asymmetric, client-server service or
  application has none of the problems symmetric services do in this
  regard -- it has no need for anything like SM.

- Asymmetric services/applications can often make use of a Resource
  Manager.  Client-server systems have this fundamental HA problem
  because the server is by definition a single point of failure
  (something absent from symmetric systems).  RM comes into the
  picture to address this problem by monitoring the server from above
  and restarting the server (possibly elsewhere) if it fails.  A prime
  example is NFS.  RM is able to monitor an NFS server and start it on
  another machine if it fails.  NFS is probably the model you should
  follow if your system is asymmetric and you want to use RM.  Perhaps
  a study of how that works is in order.

- SM has nothing to do with instantiating or even monitoring software
  foo.  It's about keeping everyone who is already running foo in sync
  as to who else is running foo.  A big part of keeping them in sync
  is an elaborate method of safely getting existing foo's to all agree
  on a change to the group of foo's.  Again, SM provides information
  to software that one way or another is /already/ started and
  running.  The information it provides is simply a list of nodeids
  that indicates where the software is currently running.

- I've gone on at length about when you /don't/ need to use SM
  (because from what I've seen, most people don't).  Here are a couple
  of points that illustrate when perhaps you /do/ need to use SM (as
  an exercise you may want to study the fence domain manager, dlm and
  gfs to see these illustrated):

  * The software is the same everywhere it's running, i.e. no separate
    client and server parts, but possibly an integrated client/server.

  * It's important for the correct operation of the software that all
    the instances on all nodes know exactly where all the other
    instances are running.  Any disagreement between nodes at any
    point in time while the software is running may be lethal.

  Say software foo has been running on nodes A and B for a while and
  has just been started on node C.  You see below that foo on B
  disagrees with foo on A and C about where foo is running:

    software foo on node A thinks foo is running on [ A B C ]
    software foo on node B thinks foo is running on [ A B ]
    software foo on node C thinks foo is running on [ A B C ]

  If this situation could cause foo on any of the nodes to run
  incorrectly, then you probably need to use something like SM, which
  suspends foo on all nodes until they are all in agreement on where
  foo is running.  If you have a foo-server, then you don't need
  something like SM because the foo-server can make unilateral
  decisions for the others to make things operate correctly.
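  To make that concrete, here's a rough sketch, in C, of the kind of
  interface this implies.  The names and signatures are made up for
  illustration (this is not the actual SM API); the pattern is the
  point: suspend foo everywhere, hand every instance the identical
  nodeid list, and only let foo resume once every node has
  acknowledged it.

    /* Illustrative only; not the real SM interface. */

    struct foo_service_ops {
            /* Suspend local activity.  After this returns, foo on
             * this node must not act on its old view of where foo
             * is running. */
            void (*stop)(void *private);

            /* Deliver the agreed-upon membership: the nodeids where
             * foo is now running.  Every node receives the same
             * list, so the A/B/C disagreement above never becomes
             * visible to foo's algorithms. */
            void (*start)(void *private, int *nodeids, int count);

            /* Every node has acknowledged the new list; it's now
             * safe for foo to resume. */
            void (*finish)(void *private);
    };

    /* foo registers itself once it is already up and running
     * locally.  The manager never launches or restarts foo; it
     * only keeps the running instances in agreement. */
    int foo_service_register(const char *name,
                             struct foo_service_ops *ops,
                             void *private);

  Note that nothing here starts foo or restarts it when it dies;
  that's exactly the part an RM does and SM doesn't.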
- I think it's possible that a client-server-based csnap system could
  be managed by SM (directly) if made to look and operate more
  symmetrically.  This would eliminate RM from the picture.

- To me there are two obvious, well-defined and well-understood ways
  to "clusterize" the csnap system.  One uses RM (and not SM), like
  NFS; the other uses SM (and not RM) with a more symmetric-looking
  integrated client-server.  Using DLM locks is a third way you might
  solve this problem without using SM or RM; I don't understand the
  details of how that might work yet, but it sounds interesting.

-- 
Dave Teigland  <teigland@xxxxxxxxxx>