Re: Qdisk question

Hi Lon and Thanks for this reply.
 
In fact, thinking about it, my test wasn't very representative of what I intended to do.
 
I blocked the qdisk communications for only one node, and after reading your reply I realize that was the wrong test. I'm going to rerun it with access to the qdisk blocked for all the nodes.
 
I'll also try your ping tie-breaker.
 
Brem

 
2009/8/13, Lon Hohberger <lhh@xxxxxxxxxx>:
On Thu, 2009-08-13 at 00:45 +0200, brem belguebli wrote:

> My understanding of qdisk is that it is used as a tie-breaker, but it
> looks like it is more a heartbeat vector than a simple tie-breaker.

Right, it's a secondary membership algorithm.


> Up to this point there is no real problem: if one site gets cut off from
> the other prod site and also from the third site (hosting the iSCSI target
> qdisk), the 2 nodes from the failing site get evicted from the cluster.
>
>
> But, what if my third site gets isolated while the 2 prod ones are
> fine ?

Qdisk votes will not be presented to CMAN any more, but the two sites
should remain online if they still have a "majority" of votes.


> The real question is what happens if all the nodes lose access
> to the qdisk while they're still able to see each other?

Qdisk is just a vote like other voting mechanisms.  If all nodes lose
access at the same time, it should behave like a node death.  However,
the default action if _one_ node loses access is to kill that node (even
if CMAN still sees it).


> The 4 nodes have each 1 vote and the qdisk 1 vote. The expected quorum
> is 3.


> If I lose the qdisk, the number of votes falls to 4, the cluster is
> quorate (4>3) but it looks like everything goes bad: each node
> deactivates itself as it can't write its alive status (--> heartbeat
> vector) to the qdisk even if the network heartbeating is working
> fine.

What happens specifically?  Most of the actions qdiskd performs are
configurable.  For example, if the nodes are rebooting, you can turn
that behavior off.
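
For reference, the vote arithmetic in the scenario above can be sketched as follows. This is only an illustration, assuming a simple majority rule (more than half of the expected votes); the function names are made up for the example and are not part of CMAN or qdiskd.

```python
# Hypothetical sketch of the vote arithmetic discussed above.
# Assumption: quorum is a simple majority of the expected votes.

def quorum_threshold(expected_votes):
    """Smallest number of votes that is more than half of expected_votes."""
    return expected_votes // 2 + 1


def is_quorate(votes_present, expected_votes):
    """True if the votes currently visible reach the quorum threshold."""
    return votes_present >= quorum_threshold(expected_votes)


NODE_VOTES = 4 * 1   # four nodes, one vote each
QDISK_VOTES = 1      # the quorum disk contributes one vote
EXPECTED = NODE_VOTES + QDISK_VOTES

if __name__ == "__main__":
    print("quorum threshold:", quorum_threshold(EXPECTED))
    # All nodes see each other but the qdisk vote is gone.
    print("quorate without qdisk:", is_quorate(NODE_VOTES, EXPECTED))
    # One site (2 nodes) isolated from everything, including the qdisk.
    print("isolated site quorate:", is_quorate(2, EXPECTED))
```

Under this assumption, losing only the qdisk vote leaves the four nodes quorate on votes alone, so any node shutdown in that case would come from qdiskd's own configurable actions rather than from a loss of quorum.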



I wrote a simple 'ping' tiebreaker based on the behaviors in RHEL3.  It
functions in many ways in the same manner as qdiskd with respect to vote
advertisement to CMAN, but without needing a disk - maybe you would find
it useful?

http://people.redhat.com/lhh/qnet.tar.gz

-- Lon

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster

