On 2/10/2012 9:14 AM, Vladislav Bogdanov wrote:
> [snip for readability just to highlight one idea]

wfm ;)

>>> Either way, internally, I don't need to exchange the list of seen nodes
>>> because either the nodelist from corosync.conf _or_ the calculation
>>> request will tell me what to do.
>>
>> For me it is always preferable to have important statements listed
>> explicitly. Implicit ones always leave a chance of being interpreted
>> incorrectly.
>>
>> Look:
>> "You have a cluster of max 8 nodes with max 10 votes, and 4 of them
>> with 5 votes are known to be active. I won't say which ones, just
>> trust me."
>>
>> "You have a cluster of max 8 nodes, and nodes A, B, C, D are active.
>> Nodes E, F, G, H are not active. A and E have two votes each, all
>> others have one vote each."
>>
>> I would always prefer the latter statement.
>> (This example has nothing to do with the split-brain discussion, it is
>> just an implicit vs. explicit example)
>>
> [snip]
>>
>> I'd also somehow recommend that, even with a redundant ring, the cluster
>> should never be put into an "undetermined" state by powering off the old
>> partition, powering on the new one and then powering on the old one again.
>> I do not know why, but I feel that is dangerous. Maybe my feeling is
>> not valid.
>
> Just to become synchronized.
>
> Taking the example above:
> You have ABCD running, 4 nodes 5 votes. expected_votes is 5,
> higher_ever_seen is 5.

correct.

> You shutdown ABCD and then poweron EFGH. Cluster runs with 4 nodes 5
> votes. expected_votes is 5, higher_ever_seen is 5.

If the shutdown and power on are done in two distinct stages (first a
complete shutdown and then the poweron), then yes, that's correct.

> You poweron A.
>
> What would be the correct final expected_votes value?

It only depends on how many votes A has (you don't say in the above
example ;))

If A has 1 vote, then you get expected_votes: 6, higher_ever_seen: 6.
If A has 2 votes, then you get 7/7 (to state the obvious).

> It would be 7 with your approach and 10 with the "seen" list

ABCD have never "seen" EFGH before, but now EFGH can see A. So it's
either 6 or 7 (based on A's votes and the current implementation).

But there is still an issue with the seen list when you move a bit away
from this example.

10 nodes (all votes 1) ABCDEFGHJK

ABCDEF running. ev: 6 hes: 6
shutdown ABCDEF (dunno why you would do that, but customers and users do
the strangest things)
poweron GHJK ev: 4 hes: 4
poweron A ev: 10 hes: 10
total_votes in the cluster 5 < quorum 6 -> KABOOM?

> (assuming we do
> not have leave_remove active, otherwise it may vary from 7 to 10,
> depending on order in which ABCD have left the cluster).

Let's put aside leave_remove for now; it does not affect
highest_ever_seen as it stands now, and that integration bit is still
missing even from my head. Let's see if we can come down to a correct
hes handling, then we can take a look at integrating with other features.

> But which of them is a correct one?

I guess it's up to us to define what is correct. So far "seen" for me
means that a certain node has seen another node live at least once
(after that I can track the state).

Fabio
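
For readers who want to replay the two "seen" list scenarios from this message, here is a rough Python sketch. It is not corosync code. It assumes quorum is a simple majority (expected_votes // 2 + 1, which matches the "quorum 6" for ev 10 quoted above) and that a "seen" list approach would set expected_votes to the total votes of every node any active node has ever seen. The function names are made up for illustration.

```python
def quorum(expected_votes):
    # Assumption: simple majority of expected_votes.
    return expected_votes // 2 + 1


def seen_list_state(votes, seen_lists, active):
    """Return (expected_votes, total_votes, quorate) for a merged "seen" list.

    votes      -- mapping of node name to its number of votes
    seen_lists -- node sets remembered by the partitions that are merging
    active     -- nodes that are running right now
    """
    # Hypothetical model: every node ever seen by anyone counts as expected.
    seen = set(active).union(*seen_lists)
    expected_votes = sum(votes[n] for n in seen)
    total_votes = sum(votes[n] for n in active)
    return expected_votes, total_votes, total_votes >= quorum(expected_votes)


# 8 nodes, A and E with two votes each: ABCD ran first, then EFGH, then A joins.
votes8 = {"A": 2, "B": 1, "C": 1, "D": 1, "E": 2, "F": 1, "G": 1, "H": 1}
eight = seen_list_state(votes8, ["ABCD", "EFGH"], "EFGHA")

# 10 nodes, one vote each: ABCDEF ran first, then GHJK, then A joins.
votes10 = {n: 1 for n in "ABCDEFGHJK"}
ten = seen_list_state(votes10, ["ABCDEF", "GHJK"], "GHJKA")

print(eight)  # ev 10, 7 votes present, still quorate
print(ten)    # ev 10, 5 votes present, below quorum 6
```

Under these assumptions the 8-node case stays quorate (7 votes against a quorum of 6), while the 10-node case does not (5 votes against a quorum of 6), which is the KABOOM case described above.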