Re: A question about Ceph's paxos implication


On Fri, May 19, 2017 at 10:37 AM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Fri, 19 May 2017, fisherman wrote:
>> Hi Sage and all Cephers,
>>
>>    I am reading Ceph's implementation of Paxos and have a question about it.
>>    The question is given by an example below:
>>
>>    Assume there are 5 monitor nodes: n1, n2, n3, n4, n5.
>>
>> 1) Node n1 is the leader, all nodes are synchronized with
>> last_committed=100, and there is no pending operation;
>> 2) A client, say c1, sends a request R1 to n1;
>> 3) Node n1 proposes a value v (for R1) with log version 101 and stores
>> version 101 and pending_v = 101 in its db, but it goes down before
>> sending anything to the other nodes;
>>    Note: only n1 has pending_v == 101.
>> 4) Node n2 becomes the leader (without n1) and the cluster becomes
>> active. Client c1 queries n2 for status, and the result shows R1 is
>> lost;
>> 5) Node n1 recovers and becomes leader again;
>> 6) Node n1 finds pending_v == 101 and log version 101, so R1 gets
>> replicated and applied;
>> 7) Client c1 queries again and finds R1 has been applied.
>>     ==> inconsistent with the result of 4)
>>
>> Am I right on this point?
>
> IIRC at step 4, as soon as a quorum is formed without n1, the original
> proposal from n1 is rendered obsolete.  (If it isn't explicitly
> invalidated, it would also be highly likely to be implicitly invalidated
> as soon as the new quorum passed its first proposal.)
   Maybe the original proposal should be rendered obsolete in the
handle_last function, once acks have arrived from everyone in the
quorum, but I can't find the code that does this.
   I agree it can be invalidated by the first proposal of the new
quorum. The inconsistency I described only occurs when a read happens
before any new proposal is made.
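
   To make the three cases concrete, here is a small toy model of the
scenario. It is only a sketch and is not taken from Ceph's source: the
Mon class, recover(), run(), and the invalidate_on_new_quorum flag are
all hypothetical names, and the "explicit invalidation" rule (drop a
pending value whose proposal number is older than one already accepted
by the peers) is my assumption about what such a check could look like.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Mon:
    """Toy monitor state; names loosely follow this thread, not Ceph's code."""
    last_committed: int = 100
    committed: Dict[int, str] = field(default_factory=dict)
    pending: Optional[Tuple[int, str]] = None  # (version, value), uncommitted
    pending_pn: int = 0   # proposal number the pending value was stored under
    accepted_pn: int = 0  # highest proposal number this monitor has accepted


def commit(mons: List[Mon], version: int, value: str) -> None:
    """Commit a value on every monitor in the given quorum."""
    for m in mons:
        m.committed[version] = value
        m.last_committed = version
        m.pending = None


def recover(leader: Mon, peers: List[Mon], invalidate_on_new_quorum: bool) -> None:
    """Hypothetical recovery step run by a returning leader (steps 5 and 6)."""
    newest = max(peers, key=lambda m: m.last_committed)
    newer_quorum_seen = max(m.accepted_pn for m in peers) > leader.pending_pn
    # Catch up on anything committed while the leader was down.
    leader.committed.update(newest.committed)
    leader.last_committed = max(leader.last_committed, newest.last_committed)
    if leader.pending is None:
        return
    version, value = leader.pending
    if version <= leader.last_committed:
        leader.pending = None  # implicit: the version was already used
    elif invalidate_on_new_quorum and newer_quorum_seen:
        leader.pending = None  # explicit: assumed rule, see lead-in
    else:
        commit([leader] + peers, version, value)  # stale value gets applied


def run(new_proposal: bool, invalidate_on_new_quorum: bool) -> Tuple[bool, bool]:
    """Return (R1 visible at step 4, R1 visible at step 7)."""
    n1, *others = [Mon() for _ in range(5)]
    # Step 3: only n1 stores the pending value, then goes down.
    n1.pending, n1.pending_pn, n1.accepted_pn = (101, "R1"), 1, 1
    # Step 4: n2..n5 form a quorum under a higher proposal number.
    for m in others:
        m.accepted_pn = 2
    if new_proposal:
        commit(others, 101, "R2")
    seen_at_step4 = "R1" in others[0].committed.values()
    # Steps 5-7: n1 returns, recovers, and the client reads again.
    recover(n1, others, invalidate_on_new_quorum)
    seen_at_step7 = "R1" in n1.committed.values()
    return seen_at_step4, seen_at_step7
```

   In this model run(False, False) is the case I am worried about: R1
is absent at step 4 but present at step 7. Both run(True, False) (the
new quorum passes a proposal first) and run(False, True) (explicit
invalidation) keep the two reads consistent.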

>
> sage


