Re: [PATCH] flatiron cpg: Enhance downlist selection algorithm

Andrew,

Andrew Beekhof wrote:
On Thu, Jun 14, 2012 at 11:19 PM, Jan Friesse <jfriesse@xxxxxxxxxx> wrote:
Let's say we have 2 nodes:
- node 2 is paused
- node 1 creates a membership (one node)
- node 2 is unpaused

The result is that node 1's downlist is selected, which means that
from node 2's point of view, node 1 was never down.

This behaviour makes sense to me.

Although ideally node2 wouldn't get a membership event until everyone
agreed whether it was a member or not*.
Is that feasible?


I'm not sure what you mean. "Everyone" in this specific case is only node 1 and node 2. While node 2 was paused, node 1 could only create a membership on its own, and that was a one-node membership. When node 2 was unpaused, we must "simulate" all the events that happened during its pause. That was the creation of a one-node membership (only node 2). After that, we can process the new membership (node 2 + node 1). So I can say it's true that everyone else (node 1) agreed that node 2 was not part of the membership, and now it is.

Also keep in mind that this patch fixes behavior that does not occur very often. Usually we have an odd number of nodes AND more than 1 (so 3, 5, ...), and in such a situation this patch has no effect (because of test #1).

But maybe I didn't understand the requirement. If so, can you please elaborate a little more on how you think the membership should look?

* That or clients might need to be made more tolerant of being kicked
out of the membership list.


With the 3 patches I've sent, a node itself should never be kicked from the membership by other nodes (with one exception, the localhost rebind, which I'm working on fixing). It's always "other nodes left membership" on that node, and on the other nodes it is "that node left membership".

Regards,
  Honza


The patch solves this situation by adding an additional check for the
largest previous membership.

So the current tests are:
1) largest (previous #nodes - #nodes known to have left)
2) (then) largest previous membership
3) (and last, as a tie-breaker) node with the smallest nodeid

Signed-off-by: Jan Friesse <jfriesse@xxxxxxxxxx>
---
  services/cpg.c |   17 +++++++++--------
  1 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/services/cpg.c b/services/cpg.c
index 7e62260..533f0c9 100644
--- a/services/cpg.c
+++ b/services/cpg.c
@@ -816,16 +816,17 @@ static struct downlist_msg* downlist_master_choose (void)
                best_members = best->old_members - best->left_nodes;
                cmp_members = cmp->old_members - cmp->left_nodes;

-               if (cmp_members < best_members) {
-                       continue;
-               }
-               else if (cmp_members > best_members) {
-                       best = cmp;
-               }
-               else if (cmp->sender_nodeid < best->sender_nodeid) {
+               if (cmp_members > best_members) {
                        best = cmp;
+               } else if (cmp_members == best_members) {
+                       if (cmp->old_members > best->old_members) {
+                               best = cmp;
+                       } else if (cmp->old_members == best->old_members) {
+                               if (cmp->sender_nodeid < best->sender_nodeid) {
+                                       best = cmp;
+                               }
+                       }
                }
+                                       best = cmp;
+                               }
+                       }
                }
-
        }

        assert (best != NULL);
--
1.7.1
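
For readers following the thread, the three tests from the patch description can be sketched as a standalone comparison. This is only an illustration, not the code in services/cpg.c: the struct `downlist_sketch` and the functions `downlist_is_better` and `downlist_choose` are hypothetical names, and only the field names (`sender_nodeid`, `old_members`, `left_nodes`) are taken from the diff. It also assumes `left_nodes` never exceeds `old_members`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a downlist message: only the
 * fields that the comparison in the patch reads. */
struct downlist_sketch {
	uint32_t sender_nodeid;
	uint32_t old_members;   /* size of the sender's previous membership */
	uint32_t left_nodes;    /* nodes the sender knows to have left */
};

/* Returns nonzero when cmp should replace best. */
static int downlist_is_better(const struct downlist_sketch *cmp,
                              const struct downlist_sketch *best)
{
	uint32_t cmp_members = cmp->old_members - cmp->left_nodes;
	uint32_t best_members = best->old_members - best->left_nodes;

	/* Test 1: largest (previous #nodes - #nodes known to have left). */
	if (cmp_members != best_members)
		return cmp_members > best_members;
	/* Test 2: largest previous membership. */
	if (cmp->old_members != best->old_members)
		return cmp->old_members > best->old_members;
	/* Test 3: tie-breaker, smallest nodeid. */
	return cmp->sender_nodeid < best->sender_nodeid;
}

/* Picks the best downlist out of a non-empty array. */
static const struct downlist_sketch *
downlist_choose(const struct downlist_sketch *lists, size_t count)
{
	const struct downlist_sketch *best;
	size_t i;

	assert(count > 0);
	best = &lists[0];
	for (i = 1; i < count; i++) {
		if (downlist_is_better(&lists[i], &best[0]))
			best = &lists[i];
	}
	return best;
}
```

As an assumed reading of the two-node scenario: if node 1 reports old_members = 1, left_nodes = 0 and node 2 reports old_members = 2, left_nodes = 1, both score 1 on test 1. Without test 2 the smaller nodeid (node 1) would win; with test 2, node 2's downlist is chosen because its previous membership was larger.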

_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss
