SAFE delivery feature request

Hello.

As a new member of the mailing list, let me start by
thanking you for this great piece of software!

Unfortunately, the unimplemented CPG_TYPE_SAFE seriously deters
State Machine Replication projects, such as database replication,
from using this type of communication mechanism.
The use case that shows the danger must be well known. Still, it would be
good to describe it here; maybe I will learn what workarounds
people have found.

  Suppose the cluster consists of three nodes N1, N2 and N3.
  By some point in time they have all delivered (totally ordered) k-1 messages.
  Suppose at that point N1 sends out its message and at once the ring splits
  into the N1 and N2+N3 subrings, so that N1's message is lost for N2+N3.
  Because CPG_TYPE_AGREED is the only available delivery semantics, N1 may
  deliver (order) the message as m_k, so the application instance on N1 will process it
  and change its state. Let's denote that formally as

      N1.state = apply(m_k).

  The N2+N3 application state would remain the one corresponding to message m_(k-1).
  But if they took over the cluster role *at once*, which they could
  because they are a majority of the former membership, the first message
  they deliver would make their states inconsistent with that of N1, because

     N2.state = apply(m_k'), m_k' != m_k.

  Notice that the inconsistency can't generally be mended by exchanging m_k'
  and m_k when N1 meets N2+N3 again in a common configuration, since each
  side has already applied a different message at position k.
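
  To make the scenario above concrete, here is a small simulation. It is only
  a sketch under my own assumptions: the Node class, the deliver() method and
  the diverged() helper are hypothetical names and have nothing to do with the
  corosync API. Each node simply applies a message as soon as it is ordered
  locally, which is how I understand AGREED delivery to behave in this case.

```python
# Hypothetical model of the split scenario; none of these names are corosync API.
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    log: list = field(default_factory=list)  # messages delivered so far, in order

    def deliver(self, msg):
        # AGREED-style delivery (assumption): apply as soon as the message
        # is ordered locally, without waiting for the other members.
        self.log.append(msg)


def diverged(a, b):
    """True if the two logs disagree at a position both nodes have filled."""
    return any(x != y for x, y in zip(a.log, b.log))


k = 4
nodes = {name: Node(name) for name in ("N1", "N2", "N3")}

# All three nodes deliver the same k-1 messages in the same order.
for node in nodes.values():
    for i in range(1, k):
        node.deliver(f"m_{i}")

# The ring splits right after N1 sends its message: only N1 delivers it as m_k.
nodes["N1"].deliver("m_k (sent by N1)")

# N2+N3 take over at once and deliver a different message at position k.
for name in ("N2", "N3"):
    nodes[name].deliver("m_k' (sent by N2)")

print(diverged(nodes["N2"], nodes["N3"]))  # the majority side stays consistent
print(diverged(nodes["N1"], nodes["N2"]))  # N1 disagrees at position k
```

  Both sides hold k messages, but position k differs, so swapping m_k and m_k'
  after a merge cannot bring the logs back into agreement.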

So, to summarize, any quorum-based solution for cluster role takeover
generally can't work on top of AGREED delivery alone. For instance, database
replication seems infeasible.

As to workarounds, I see just one:
when the totem ring configuration changes as above, the cluster service
should be suspended until N1 is back.
I don't think it can be counted as universal. At the same time, SAFE delivery
should not really be a challenging task, according to my reading of the
Totem protocol, as well as to a mail I found:

  From sdake at redhat.com  Sun Mar 11 20:18:51 2012
  From: sdake at redhat.com (Steven Dake)
  Date: Sun, 11 Mar 2012 13:18:51 -0700
  Subject:  [PATCH] drop evs service
  In-Reply-To: <1331449088-28169-1-git-send-email-fdinitto@xxxxxxxxxx>
  References: <1331449088-28169-1-git-send-email-fdinitto@xxxxxxxxxx>
  Message-ID: <4F5D08AB.1010202@xxxxxxxxxx>

  Ugh
  On 03/10/2012 11:58 PM, Fabio M. Di Nitto wrote:
  > From: "Fabio M. Di Nitto" <fdinitto at redhat.com>
  > 
  > there are several reasons for this:
  > 
  > 1) evs is only partially implemented with no plans to complete it
  > 
  > typedef enum {
  >        EVS_TYPE_UNORDERED, /* not implemented */
  >        EVS_TYPE_FIFO,          /* same as agreed */
  >        EVS_TYPE_AGREED,
  >        EVS_TYPE_SAFE           /* not implemented */
  > } evs_guarantee_t;
  > 

  We should implement safe at some point - its pretty easy to do.
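
For what it's worth, the rule I have in mind for SAFE is that a node delivers
a message only once it knows every member of the current configuration has
received it. Below is a minimal sketch of that rule alone, under my own
assumptions: SafeNode, receive() and on_ack() are hypothetical names, acks are
modelled as direct calls rather than anything Totem actually does on the ring,
and ordering between several pending messages is ignored.

```python
# Hypothetical sketch of the SAFE delivery rule; not corosync or Totem code.
class SafeNode:
    def __init__(self, name, members):
        self.name = name
        self.members = set(members)  # current configuration
        self.pending = {}            # msg -> set of nodes known to hold it
        self.log = []                # messages actually delivered

    def _try_deliver(self, msg):
        # Deliver only when every member of the configuration holds the message.
        if self.members <= self.pending[msg]:
            self.log.append(msg)
            del self.pending[msg]

    def receive(self, msg):
        # The node itself holds the message, but that alone is not enough.
        self.pending[msg] = {self.name}
        self._try_deliver(msg)

    def on_ack(self, msg, sender):
        # Assumed acknowledgement: 'sender' is known to have received 'msg'.
        if msg in self.pending:
            self.pending[msg].add(sender)
            self._try_deliver(msg)


members = ["N1", "N2", "N3"]

# Split case: N1 never learns that N2 and N3 received m_k, so it holds it back.
split_n1 = SafeNode("N1", members)
split_n1.receive("m_k")

# Healthy case: both peers acknowledge, so N1 delivers m_k.
healthy_n1 = SafeNode("N1", members)
healthy_n1.receive("m_k")
healthy_n1.on_ack("m_k", "N2")
healthy_n1.on_ack("m_k", "N3")

print(split_n1.log, healthy_n1.log)
```

In the split case N1's state stays at m_(k-1), so it does not run ahead of
N2+N3. What should happen to the held-back message after the membership
changes is a separate question that this sketch does not try to answer.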

With best wishes,

Andrei