On 01/14/2012 09:09 AM, Vladislav Bogdanov wrote:
> Hi,
>
> 13.01.2012 21:21, Fabio M. Di Nitto wrote:
> [snip]
>>> + expected_votes is removed and instead auto-calculated based upon
>>> quorum_votes in the node list
>
> Is it possible to dynamically keep track of "seen" nodes here and use
> only that nodes for expected_votes calculations?
>
> I even have a use-case for that:
> I "run" cluster consisting of max 17 nodes with UDPU, so all nodes are
> listed in config. Currently only 3 nodes are powered on, because I do
> not have load which requires more yet (and power is expensive in
> european datacenters). When load increases I'd just power on additional
> nodes and quorum expectations are recalculated automagically.
> I have that implemented with corosync + pacemaker right now. Pacemaker
> keeps that list of nodes and does quorum calculations correctly. And I'm
> absolutely happy with that. From what I see changes being discussed will
> break my happiness.

Yes and no. Let me explain:

votequorum already does that internally.

For example, with

expected_votes: 3

in corosync.conf, you power on your 4th node (assuming everybody votes 1
to make it simpler in this example) and expected_votes is automatically
bumped to 4 on all nodes.

While this is what you are asking for, there are a few corner cases
where this could lead to a dangerous situation.

First of all, the new expected_votes is not written to disk but only
retained internally by votequorum.

This approach does not protect you against partitions properly,
especially at startup of the cluster.

For example, out of 16 nodes, 8 are on switch A and 8 on switch B.
The interlink between the switches is broken.
All nodes know of expected_votes: 3 from corosync.conf.

Both partitions of the cluster can achieve quorate status, and they can
create chaos: fencing each other, data corruption and all.

Now, we agree that this is generally an admin error (not noticing that
the interlink is down), but it leaves a window open for disasters.
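
To make the arithmetic of that example concrete, here is a rough Python
sketch. It assumes a plain majority rule (quorum = expected_votes / 2 + 1,
integer division), one vote per node, and that expected_votes is only ever
bumped upwards to the number of votes seen. The function names are made up
for illustration; this is not votequorum's actual code or API.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Assumed rule: simple majority of the expected votes."""
    return expected_votes // 2 + 1


def effective_expected_votes(configured: int, votes_seen: int) -> int:
    """Assumed behaviour: expected_votes is raised to the votes seen,
    never lowered below the configured value."""
    return max(configured, votes_seen)


def is_quorate(configured: int, votes_seen: int) -> bool:
    """A partition is quorate if the votes it sees reach the threshold."""
    expected = effective_expected_votes(configured, votes_seen)
    return votes_seen >= quorum_threshold(expected)


# 16 nodes with one vote each, split 8/8 by the broken interlink.
partition_a = 8
partition_b = 8

# Case 1: expected_votes: 3 in corosync.conf.
# Each side bumps its expectation to 8, needs 5 votes, and has 8.
both_quorate_low = is_quorate(3, partition_a) and is_quorate(3, partition_b)

# Case 2: expected_votes: 16 in corosync.conf.
# Each side needs 9 votes and has only 8.
any_quorate_full = is_quorate(16, partition_a) or is_quorate(16, partition_b)

print(both_quorate_low)   # True: both halves believe they have quorum
print(any_quorate_full)   # False: neither half can claim quorum
```

Under these assumptions, a low configured expected_votes lets both halves
of the split come up quorate at the same time, while a value that counts
all 16 nodes leaves both halves inquorate until the interlink is repaired.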

On the other hand, I am not going to force users to do it differently.
The current votequorum implementation allows this use case, and I am
not going to enforce anything different. Users should still be aware of
what they are asking for, though.

>
> It would also be great if I'm able to forcibly remove inactive node from
> that "seen" list with just one command on *one* cluster node. Use case
> for that is a human error when wrong node is powered on by mistake.

The "seen" list within the quorum module is dynamic. As soon as you shut
down a node (cleanly or otherwise) and totem notices that the node has
gone away, that same node is removed from the quorum calculation.

Your concern is probably related to the nodelist being discussed, but it
is up to others to decide "how" to handle addition and removal of nodes.
It doesn't affect the quorum module at all.

Fabio

>
> Best,
> Vladislav
>
>>> + votes is moved to the individual node list
>>
>> I will only speak for quorum:
>>
>> quorum itself doesn't need quorum_votes. It is optional (like David
>> already mentioned). default to 1.
>>
>> quorum doesn't care about nodeid in general. A list of nodeid makes
>> auto-tie-breaker working a bit earlier in the first cluster bootstrap
>> process, but it's nothing worth going crazy for.
>>
>> Requiring a list is not mandatory either for quorum operations.
>>
>> I suggest to keep it flexible instead. Not everybody wants or need a
>> nodelist (mcast/bcast).
>>
>> I suggest that if nodelist is available quorum uses it by default.
>> If the list is not available, then we want expected_votes.
>>
>> If neither are available we error out, if both are available the list
>> has higher priority vs expected_votes setting.
>>
>> I personally have no opinion on how the list is structured as long as I
>> can easily reiterate through the node list and be able to find out which
>> node I am in that list (specially if nodeid are not specified).
>>
>> Fabio
>> _______________________________________________
>> discuss mailing list
>> discuss@xxxxxxxxxxxx
>> http://lists.corosync.org/mailman/listinfo/discuss