Re: [RFC] quorum module configuration bits

On 2/10/2012 9:14 AM, Vladislav Bogdanov wrote:
> [snip for readability just to highlight one idea]

wfm ;)

>>>
>>> Either way, internally, I don't need to exchange the list of seen nodes
>>> because either the nodelist from corosync.conf _or_ the calculation
>>> request will tell me what to do.
>>
>> I always prefer to have important statements listed explicitly.
>> Implicit ones always leave a chance of being interpreted incorrectly.
>>
>> Look:
>> "You have a cluster of max 8 nodes with max 10 votes, and 4 of them with 5
>> votes are known to be active. I won't say which ones, just trust me."
>>
>> "You have a cluster of max 8 nodes, and nodes A, B, C, D are active. Nodes
>> E, F, G, H are not active. A and E have two votes each, all others have
>> one vote each."
>>
>> I would always prefer the latter statement.
>> (This example has nothing to do with the split-brain discussion, it is
>> just an implicit vs. explicit example.)
>>
> [snip]
>>
>> I'd also somehow recommend that, even with a redundant ring, the cluster
>> should never be put into an "undetermined" state by powering off the old
>> partition, powering on a new one and then powering on the old one again.
>> I do not know why, but I feel that is dangerous. Maybe my feeling is not
>> valid.
> 
> Just to become synchronized.
> 
> Taking the example above:
> You have ABCD running, 4 nodes 5 votes. expected_votes is 5,
> highest_ever_seen is 5.

correct.

> You shut down ABCD and then power on EFGH. The cluster runs with 4 nodes,
> 5 votes. expected_votes is 5, highest_ever_seen is 5.

If the shutdown and power on are done in two distinct stages (first a
complete shutdown and then poweron), then yes, that's correct.

> You poweron A.
> 
> What would be the correct final expected_votes value?

It depends only on how many votes A has (you don't say in the above
example ;))

If A has 1 vote, you get expected_votes: 6, highest_ever_seen: 6.
With 2 votes, you get 7/7 (to state the obvious).

> It would be 7 with your approach and 10 with "seen" list

ABCD have never "seen" EFGH before, but now EFGH can see A. So it's
either 6 or 7 (based on A's votes and the current implementation).

But there is still an issue with the seen list when you move a bit away
from this example.

10 nodes (all with 1 vote); ev = expected_votes, hes = highest_ever_seen

ABCDEFGHJK

ABCDEF running.
ev: 6 hes: 6

shutdown ABCDEF
(dunno why you would do that, but customers and users do the strangest
things)

poweron GHJK
ev: 4 hes: 4

poweron A
ev: 10 hes: 10
total_votes in the cluster: 5 < quorum: 6 -> KABOOM?
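
The same walk-through as a small Python sketch, in case the arithmetic is
easier to follow that way. Assumptions: quorum is a strict majority of
expected_votes (which matches the "quorum 6" for ev 10 above), every node
has one vote, and the "seen" list is modelled as a plain set that A brings
back with it. None of the names below come from the actual code.

```python
def quorum(expected_votes):
    """Strict majority; gives 6 for expected_votes = 10 as in the example."""
    return expected_votes // 2 + 1


# Ten nodes, one vote each.
first_partition = set("ABCDEF")   # ran first, then was shut down
second_partition = set("GHJK")    # powered on afterwards

# A powers on and joins GHJK, carrying its memory of ABCDEF.
seen = second_partition | first_partition
live = second_partition | {"A"}

expected_votes = len(seen)        # every node anyone has ever seen
total_votes = len(live)           # nodes that are actually up right now
needed = quorum(expected_votes)
quorate = total_votes >= needed

print(f"ev: {expected_votes} total_votes: {total_votes} "
      f"quorum: {needed} quorate: {quorate}")
```

In this model a running, healthy partition of four nodes loses quorum the
moment a single node with a larger "seen" list joins it.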

> (assuming we do
> not have leave_remove active, otherwise it may vary from 7 to 10,
> depending on order in which ABCD have left the cluster).

Let's put aside leave_remove for now. As it stands it does not affect
highest_ever_seen, and I have not worked out that integration bit even
in my head. Let's see if we can come down to a correct hes handling,
then we can take a look at integrating with other features.

> But which of them is a correct one?

I guess it's up to us to define what is correct.

So far "seen" for me means that a certain node has seen another node
live at least once (after that I can track the state).

Fabio


