Re: [RFC] quorum module configuration bits

On Thu, Jan 12, 2012 at 11:10:44AM +1100, Andrew Beekhof wrote:
> On Thu, Jan 12, 2012 at 3:21 AM, David Teigland <teigland@xxxxxxxxxx> wrote:
> >> I much prefer the expected_votes field to any enumeration of the nodes.
> >
> > Expect admins to keep track of what expected_votes should be?
> 
> No. Have corosync do it automatically.
> The quorum code knows how many nodes it has seen and how many votes they had.
> expected_votes is already updated internally if the current number is
> greater than what was configured, I'm only suggesting that it also be
> recorded on disk.

It does get close, but has gaps.  Some sort of recording like this may be
beneficial regardless; I'm not saying it's a bad idea by any means.
My argument is mainly: a node list is a very good thing in its own right
because it makes administration sane, *and* it happens to completely solve
the expected_votes problem.  Taken together, those make a node list a
no-brainer to me (and, I suspect, to most users).

Some possible gaps:

- Nodes start up with no value (the first time, or after the saved value
is lost).  Other options for dealing with this are limited, and bring
problems of their own.

- I believe expected_votes (EV) is only ever adjusted upward, never
automatically decreased.  This means that all nodes need to be members at
once before EV reaches the correct value; i.e. if you add a new node while
another node is not a member, EV will not be updated.

- I can see these saved EV values becoming inconsistent, with few ways to
reconcile/fix them.
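To make the second gap concrete, the ratchet-only update can be sketched
like this (pure illustration of the behavior described above, not
corosync's actual code):

```shell
# Sketch of the ratchet-only update: expected_votes (ev) grows to match
# the largest vote total actually seen, but never shrinks, so it never
# learns about a node that hasn't been present in the membership.
ev=3                            # configured for the original 3-node cluster
for votes_present in 3 2 3; do  # node4 is added while node2 is down,
                                # so 4 votes are never present at once
    [ "$votes_present" -gt "$ev" ] && ev=$votes_present
done
echo "expected_votes=$ev"       # still 3, though the cluster now has 4 nodes
```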

> > The alternative would be trying to remember what they all are?
> 
> Why do you need to if expected votes is set correctly?
>
> > Defining the set of nodes that compose the cluster seems like
> > a very good thing just for its own sake.

You really want an authoritative list of nodes that compose the cluster
just for the sanity of administration.  Say you go on vacation, come back,
and don't remember all the machines that are supposed to be in the
cluster.  Or say a new admin replaces you and doesn't know all the
machines you set up to be in the cluster.  How do you go about figuring
out what they all are?  There's no list of them, they may very well not
all be online.  Some may be powered down, some may have been plugged into
the wrong switch... you're helpless.  Or say you think you've found them
all, but one day an old machine is powered back up and suddenly a rogue
node you'd forgotten about is disrupting your cluster.

Or, say you want to write a script to do something on all the cluster
machines, or just to start the cluster on all the nodes.  Where does your
script get a list of nodes to iterate through?  (Note, this is the basis
of the QE tests.)
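For instance, with an explicit nodelist in corosync.conf, a script can
derive its iteration set straight from the config rather than from current
membership.  A minimal sketch, assuming the corosync 2.x nodelist syntax
(file path and node names are illustrative):

```shell
#!/bin/sh
# Sketch: derive the cluster node list from corosync.conf instead of from
# current membership, so powered-down nodes are still included.

# Print every ring0_addr value found in the nodelist section of $1.
list_nodes() {
    awk '/ring0_addr/ { sub(/^[ \t]*ring0_addr:[ \t]*/, ""); print }' "$1"
}

# Typical use: iterate over every configured node, member or not.
# for n in $(list_nodes /etc/corosync/corosync.conf); do
#     ssh "$n" some-admin-command
# done
```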

> On the other hand, I'd argue that forcing people to run
> corosync-quorumtool and to then re-add the same information to the
> config with an editor, on every existing node, when adding a new
> member is inherently error prone.

I don't understand this.  First, quorumtool won't give you a list of all
the nodes, since the set of all nodes is not the same as the set of
current members.  Second, adding a node should be trivial: add the new
node to the existing corosync.conf, then scp corosync.conf to all the
nodes.
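Concretely (using the corosync 2.x nodelist syntax; node names and IDs
here are illustrative), adding a node amounts to appending one stanza and
redistributing the file:

```
nodelist {
    node {
        ring0_addr: node1
        nodeid: 1
    }
    node {
        ring0_addr: node2
        nodeid: 2
    }
    node {                    # new node: append this stanza, then
        ring0_addr: node3     # scp corosync.conf to every node
        nodeid: 3
    }
}
```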


_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss

